Abstract
This paper reports on an investigation of disease discovery from genomic data, using methods that depart substantially from customary practice in genome-wide association studies. Such data generally comprise the genomic content of two contrasting phenotypes, e.g., disease versus control populations, and the analysis proceeds under the hypothesis that dissimilarities between the populations might reveal disease risk alleles. The proposed suite of new methods is based in part on information theory (Shannon 1948a, b; Jaynes 1957), and strong evidence will be given of the effectiveness of this new approach. The methodology extends naturally and successfully to predicting genomic disposition to disease arising from large collections of weakly contributing genomic loci. Evidence will be advanced that adult-onset diabetes (“type 2 diabetes”) is such a candidate disease, and in this case, probably for the first time, it can be demonstrated that disease prediction is possible. Another novel element of this study is the search for, and identification of, potentially beneficial genomic loci that may counter a disease. The generality of the methodology suggests that it might extend to other diseases.
References
Altshuler D, Hirschhorn JN, Klannemark M, Lindgren CM, Vohl MC, Nemesh J, Lane CR, Schaffner SF, Bolk S, Brewer C et al (2000) The common PPARγ Pro12Ala polymorphism is associated with decreased risk of type 2 diabetes. Nat Genet 26:76–80
Bansal V, Libiger O, Torkamani A, Schork NJ (2010) Statistical analysis strategies for association studies involving rare variants. Nat Rev Genet 11:773–785
Bush WS, Moore JH (2012) Chapter 11: genome-wide association studies. PLoS Comput Biol 8:e1002822
Chakravarti A (2011) Genomics is not enough. Science 334:15
Hirschhorn JN, Daly MJ (2005) Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 6:95–108
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106:620–630
Johnson AD, O’Donnell CJ (2009) An open access database of genome-wide association results. BMC Med Genet 10:6
Jonsson T, Atwal JK, Steinberg S, Snaedal J, Jonsson PV, Bjornsson S, Stefansson H, Sulem P, Gudbjartsson D, Maloney J et al (2012) A mutation in APP protects against Alzheimer’s disease and age-related cognitive decline. Nature 488:96–99
Klein RJ, Zeiss C, Chew EY, Tsai J-Y, Sackler RS, Haynes C, Henning AK, SanGiovanni JP, Mane SM, Mayne ST et al (2005) Complement factor H polymorphism in age-related macular degeneration. Science 308:385–389
Kolata G (2012) Study says DNA’s power to predict illness is limited. NY Times. 2 April 2012
Lander ES (1996) The new genomics: global views of biology. Science 274:536–539
Mahajan A, Go MJ, Zhang W, Below JE, Gaulton KJ, Ferreira T, Horikoshi M, Johnson AD, Ng MCY, Prokopenko I et al (2014) Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat Genet 46:234–244
Manolio TA (2010) Genomewide association studies and assessment of the risk of disease. N Engl J Med 363:166–176
McCarthy MM, Menzel S (2001) The genetics of type 2 diabetes. Br J Clin Pharmacol 51:195–199
Morris AP, Voight BF, Teslovich TM, Ferreira T, Segre AV, Steinthorsdottir V, Strawbridge RJ, Khan H, Grallert H, Mahajan A et al (2012) Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet 44:981–990
National Human Genome Research Institute. A catalog of published genome-wide association studies. http://www.genome.gov/gwastudies/
Neel J (1982) The genetics of diabetes mellitus. Academic Press, London
Pe’er I, de Bakker PIW, Maller J, Yelensky R, Altshuler D, Daly MJ (2006) Evaluating and improving power in whole-genome association studies using fixed marker sets. Nat Genet 38:663–667
Pritchard JK (2001) Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet 69(1):124–137
Pritchard JK, Cox NJ (2002) The allelic architecture of human disease genes: common disease-common variant... or not? Hum Mol Genet 11:2417–2423
Reich DE, Lander ES (2001) On the allelic spectrum of human disease. Trends Genet 17:502–510
Roberts NJ, Vogelstein JT, Parmigiani G, Kinzler KW, Vogelstein B, Velculescu VE (2012) The predictive capacity of personal genome sequencing. Sci Transl Med 4(133):133ra58
Shannon CE (1948a) A mathematical theory of communication. Bell Syst Tech J 27:379–423
Shannon CE (1948b) A mathematical theory of communication. Bell Syst Tech J 27:623–656
Sirovich L (1987) Turbulence and the dynamics of coherent structures, parts I, II, and III. Q Appl Math XLV:561–590
Sirovich L (2014) Genomic data and disease forecasting: application to type 2 diabetes (T2D). PLoS One 9:e85684
Valle T, Tuomilehto J, Bergman RN, Ghosh S, Hauser ER, Eriksson J, Nylund SJ, Kohtamaki K, Toivanen L, Vidgren G et al (1998) Mapping genes for NIDDM. Design of the Finland-United States Investigation of NIDDM Genetics (Fusion) Study. Diabetes Care 21:949–958
Wade N (2010) A decade later, genetic map yields few new cures. New York Times, New York
Acknowledgments
I thank Bruce Knight and Jon Victor for their comments on reading this manuscript, Jon for suggesting the terminology “incremental information” for (2.5), and Max Pensack for his help with the data. An essential element of this effort was the high-quality Fusion database, which was acquired from the National Institutes of Health at the request of the Rockefeller University Committee for Clinical and Translational Science [UL1RR024143], National Center for Research Resources, National Institutes of Health. Support for this project was provided by a grant from the Robertson which the author gratefully acknowledges. Finally, grateful thanks to Mitchell Feigenbaum and Bruce Knight for affording me the hospitality of Rockefeller University.
Cite this article
Sirovich, L. A new structural approach to genomic discovery of disease: example of adult-onset diabetes. Biol Cybern 110, 383–391 (2016). https://doi.org/10.1007/s00422-016-0692-8
DOI: https://doi.org/10.1007/s00422-016-0692-8