Abstract
Inference of ancestry from genetic data is a fundamental problem in computational genetics, with wide applications in human genetics and population genetics. Treating ancestry as a continuum rather than as a categorical trait has recently been advocated in the literature. In particular, it was shown that a European individual’s geographic coordinates of origin can be determined to within a few hundred kilometers using spatial ancestry inference methods. Current methods for the inference of spatial ancestry focus on individuals whose ancestors all originated from the same geographic location. In this work we develop a spatial ancestry inference method that infers the geographic coordinates of the ancestral origins of recently admixed individuals, i.e. individuals whose recent ancestors originated from multiple locations. Our model is based on multivariate normal distributions integrated into a two-layered Hidden Markov Model, designed to capture both long-range correlations between SNPs due to the recent mixing and short-range correlations due to linkage disequilibrium. We evaluate the method on both simulated and real European data, and demonstrate that it achieves accurate results for up to three generations of admixture. Finally, we discuss the challenges of spatial inference for older admixtures and suggest directions for future work.
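To make the idea of long-range correlations due to recent mixing more concrete, the following is a minimal illustrative sketch and not the model developed in the paper. It assumes a single haplotype, a fixed set of candidate ancestors with known per-SNP allele frequencies, and a constant per-SNP switch probability, and it uses standard Viterbi decoding to assign each SNP to an ancestor. The function name `viterbi_ancestry` and all parameters are hypothetical. The sketch ignores linkage disequilibrium and the multivariate normal spatial component described in the abstract.

```python
import math


def viterbi_ancestry(haplotype, freqs, switch_prob):
    """Return the most likely ancestor index for each SNP of one haplotype.

    haplotype   : list of 0/1 alleles, one per SNP
    freqs       : freqs[k][j] is the assumed frequency of allele 1 at SNP j
                  for ancestor k (must lie strictly between 0 and 1)
    switch_prob : assumed probability of changing ancestor between adjacent
                  SNPs (a stand-in for recombination since admixture)
    Requires at least two ancestors.
    """
    n_states = len(freqs)
    n_snps = len(haplotype)
    log_stay = math.log(1.0 - switch_prob)
    # A switch is split uniformly over the other ancestors.
    log_switch = math.log(switch_prob / (n_states - 1))

    def log_emit(k, j):
        p = freqs[k][j]
        return math.log(p if haplotype[j] == 1 else 1.0 - p)

    def log_trans(i, k):
        return log_stay if i == k else log_switch

    # Uniform prior over ancestors at the first SNP.
    score = [math.log(1.0 / n_states) + log_emit(k, 0) for k in range(n_states)]
    back = []  # back[j-1][k] = best predecessor of state k at SNP j
    for j in range(1, n_snps):
        new_score, pointers = [], []
        for k in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda i: score[i] + log_trans(i, k))
            new_score.append(score[best_prev] + log_trans(best_prev, k)
                             + log_emit(k, j))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy usage: two ancestors with opposite allele frequencies at six SNPs.
toy_freqs = [[0.9] * 6, [0.1] * 6]
print(viterbi_ancestry([1, 1, 1, 0, 0, 0], toy_freqs, switch_prob=0.05))
# prints [0, 0, 0, 1, 1, 1]
```

In this toy setting, a small switch probability favors long contiguous segments from the same ancestor, which is the kind of long-range structure a recent admixture event is expected to leave. A model along the lines of the abstract would additionally treat the ancestral locations as unknowns and account for short-range correlations between nearby SNPs.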
Y. Margalit and Y. Baran contributed equally to this work.
Acknowledgements
E.H. is a faculty fellow of the Edmond J. Safra Center at Tel Aviv University. Y.B. was supported in part by a fellowship from the Edmond J. Safra Center for Bioinformatics at Tel Aviv University. E.H. and Y.B. were also supported in part by the United States Israel Binational Science Foundation (grant 2012304), by the National Science Foundation (grant III-1217615), and by the Israel Science Foundation (grant 989/08). E.H., Y.B., and Y.M. were partially supported by the German-Israeli Foundation (grant 1094-33.2/2010). E.H. was also supported by the Israel Science Foundation (grant 1425/13).
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Margalit, Y., Baran, Y., Halperin, E. (2015). Multiple-Ancestor Localization for Recently Admixed Individuals. In: Pop, M., Touzet, H. (eds) Algorithms in Bioinformatics. WABI 2015. Lecture Notes in Computer Science, vol 9289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48221-6_9
DOI: https://doi.org/10.1007/978-3-662-48221-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48220-9
Online ISBN: 978-3-662-48221-6
eBook Packages: Computer Science, Computer Science (R0)