Abstract
Genome-wide association studies require genotyping the DNA sequences of a large sample of individuals with and without the disease of interest. Current genotyping technologies cover only a limited portion of each individual's DNA sequence, so a large fraction of single nucleotide polymorphisms (SNPs) are not genotyped. Existing imputation methods rely on individual-level data and are often time-consuming and costly. A recently developed method, the Minimum Deviation of Conditional Probability (MiDCoP), imputes the allele frequencies of missing SNPs from the allele frequencies of neighboring SNPs, without using individual-level SNP information. This article evaluates the MiDCoP approach through association analysis based on the imputed allele frequencies, using the GAIN Schizophrenia data. The results indicate that the choice of reference sets has a strong impact on performance: imputation accuracy improves when the case and control data sets are each imputed using a separate, better-matched reference set.
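To illustrate what an association analysis based on allele frequencies alone can look like, here is a minimal sketch of a standard 1-degree-of-freedom allelic chi-square test that takes only case and control allele frequencies and sample sizes. It is not the MiDCoP imputation procedure, and not necessarily the test used in the chapter; the function name `allelic_chi_square` and its parameters are hypothetical, chosen for illustration only.

```python
from math import erfc, sqrt


def allelic_chi_square(freq_case, n_case, freq_control, n_control):
    """Allelic chi-square test (1 df) from summary allele frequencies.

    freq_case / freq_control: frequency of the reference allele in each group
    (these could be observed or imputed frequencies).
    n_case / n_control: number of individuals; each contributes 2 alleles.
    Returns (chi-square statistic, p-value).
    """
    alleles_case = 2 * n_case
    alleles_control = 2 * n_control

    # Rebuild the 2x2 table of (expected) allele counts from the frequencies.
    a = freq_case * alleles_case          # cases, reference allele
    b = alleles_case - a                  # cases, other allele
    c = freq_control * alleles_control    # controls, reference allele
    d = alleles_control - c               # controls, other allele

    total = alleles_case + alleles_control
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # Allele is monomorphic in the pooled sample: no evidence either way.
        return 0.0, 1.0

    chi2 = total * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = allelic_chi_square(0.30, 1000, 0.25, 1000)
    print(f"chi2 = {chi2:.3f}, p = {p:.2e}")
```

Because the test needs only group-level frequencies and sample sizes, it can be applied to imputed allele frequencies without any individual-level genotypes, which is the setting the abstract describes. The sketch treats imputed frequencies as if they were observed and does not account for imputation uncertainty.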
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Gautam, Y., Lee, C., Cheng, CI., Langefeld, C. (2015). An Evaluation of the MiDCoP Method for Imputing Allele Frequency in Genome Wide Association Studies. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Studies in Computational Intelligence, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-319-10389-1_5
DOI: https://doi.org/10.1007/978-3-319-10389-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10388-4
Online ISBN: 978-3-319-10389-1
eBook Packages: Engineering, Engineering (R0)