Abstract
The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem but has important applications both to basic research and to the discovery of genotype-phenotype correlations. In this study, we present a novel approach to inferring human evolutionary history from genetic variation data. Our approach uses the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding the robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. We assess the quality of the method on two large-scale genetic variation data sets: the HapMap Phase II and the Human Genome Diversity Project. Qualitative comparison to a consensus model of the evolution of modern human population groups shows that our inferences closely match our best current understanding of human evolutionary history. A further comparison with results of a leading method for the simpler problem of population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups in addition to inferring the relationships among them.
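As a rough illustration of the consensus-tree idea mentioned in the abstract, the sketch below applies plain majority-rule consensus to a handful of rooted local trees. It is not the authors' algorithm, whose details are not given here; majority rule is a standard textbook technique substituted for illustration. The nested-tuple tree encoding, the function names `clades` and `majority_clades`, and the four toy trees are all assumptions made for this example. Only the population labels (YRI, CEU, CHB, JPT) correspond to real HapMap Phase II samples.

```python
from collections import Counter


def clades(tree):
    """Return the non-trivial clades (as frozensets of leaf labels) of a
    rooted tree encoded as nested tuples with string leaves."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    root = walk(tree)
    found.discard(root)  # the full leaf set is in every tree, so it is uninformative
    return found


def majority_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to the
    fraction of trees that contain it."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    total = len(trees)
    return {c: n / total for c, n in counts.items() if n / total > threshold}


# Toy local phylogenies (hypothetical, not derived from real data),
# one per genomic region.
local_trees = [
    ("YRI", ("CEU", ("CHB", "JPT"))),
    ("YRI", ("CEU", ("CHB", "JPT"))),
    (("YRI", "CEU"), ("CHB", "JPT")),
    ("YRI", ("CHB", ("CEU", "JPT"))),
]

support = majority_clades(local_trees)
for clade, frac in sorted(support.items(), key=lambda kv: len(kv[0])):
    print(sorted(clade), frac)
```

In this toy input the groupings {CHB, JPT} and {CEU, CHB, JPT} each appear in three of the four trees and survive, while groupings seen in only one tree are discarded. Clades that each occur in more than half of the trees are always mutually compatible, so the surviving set can be assembled into a single tree.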
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsai, MC., Blelloch, G., Ravi, R., Schwartz, R. (2010). A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure. In: Borodovsky, M., Gogarten, J.P., Przytycka, T.M., Rajasekaran, S. (eds) Bioinformatics Research and Applications. ISBRA 2010. Lecture Notes in Computer Science, vol 6053. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13078-6_20
DOI: https://doi.org/10.1007/978-3-642-13078-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13077-9
Online ISBN: 978-3-642-13078-6
eBook Packages: Computer Science, Computer Science (R0)