Abstract
Genome-wide association studies (GWAS) have discovered numerous loci involved in genetic traits. Virtually all studies have reported associations between individual single nucleotide polymorphisms (SNPs) and traits. However, it is likely that complex traits are influenced by interactions among multiple SNPs. One way to detect SNP interactions is the brute force approach, which performs a pairwise association test between a trait and every pair of SNPs. This approach is often computationally infeasible because of the large number of SNPs collected in current GWAS. We propose a two-stage model, the Threshold-based Efficient Pairwise Association Approach (TEPAA), that reduces the number of tests needed while maintaining almost identical power to the brute force approach. In the first stage, our method performs the single-marker test on all SNPs and selects the subset of SNPs that reach a given significance threshold. In the second stage, we perform a pairwise association test between the trait and each pair of SNPs selected in the first stage. The key insight of our approach is that we derive the joint distribution of the association statistic of a single SNP and the association statistic of a pair of SNPs. This joint distribution allows us to guarantee that the statistical power of our approach closely approximates that of the brute force approach. We applied our approach to the Northern Finland Birth Cohort data and achieved a 63-fold speedup while maintaining 99% of the power of the brute force approach.
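The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`pearson_z`, `two_stage_scan`) are hypothetical, the single-marker statistic is taken to be a scaled Pearson correlation, and the pairwise test is stood in for by the correlation between the trait and the product of the two genotype vectors. In particular, TEPAA chooses its first-stage threshold from the derived joint distribution so that power is preserved; here both thresholds are simply passed in by the caller.

```python
import itertools
import math


def pearson_z(x, y):
    """Return sqrt(n) * Pearson correlation of x and y.

    Assumption: used here as a stand-in association statistic that is
    roughly standard normal when x and y are unassociated.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return math.sqrt(n) * sxy / math.sqrt(sxx * syy)


def two_stage_scan(snps, trait, stage1_threshold, stage2_threshold):
    """Hypothetical two-stage pairwise scan.

    snps  : list of genotype vectors, one per SNP (e.g. 0/1/2 codes)
    trait : list of trait values, one per individual
    Both thresholds are caller-supplied assumptions.
    """
    # Stage 1: single-marker test on every SNP, keep those passing the threshold.
    selected = [
        i for i, genotype in enumerate(snps)
        if abs(pearson_z(genotype, trait)) >= stage1_threshold
    ]

    # Stage 2: pairwise test only among the SNPs that survived stage 1.
    hits = []
    for i, j in itertools.combinations(selected, 2):
        interaction = [a * b for a, b in zip(snps[i], snps[j])]
        z = pearson_z(interaction, trait)
        if abs(z) >= stage2_threshold:
            hits.append((i, j, z))

    # Total tests run: one per SNP, plus one per selected pair.
    n_tests = len(snps) + len(selected) * (len(selected) - 1) // 2
    return selected, hits, n_tests
```

With m SNPs the brute force scan runs m(m-1)/2 pairwise tests, whereas this sketch runs m single-marker tests plus k(k-1)/2 pairwise tests for the k SNPs that pass the first stage, which is where the saving comes from when k is much smaller than m.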
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, Z., Sul, J.H., Snir, S., Lozano, J.A., Eskin, E. (2014). Gene-Gene Interactions Detection Using a Two-Stage Model. In: Sharan, R. (ed.) Research in Computational Molecular Biology. RECOMB 2014. Lecture Notes in Computer Science, vol 8394. Springer, Cham. https://doi.org/10.1007/978-3-319-05269-4_28
DOI: https://doi.org/10.1007/978-3-319-05269-4_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05268-7
Online ISBN: 978-3-319-05269-4
eBook Packages: Computer Science, Computer Science (R0)