DOI: 10.1145/3543377.3543387
research-article

Heritability, genetic variation, and the number of risk SNPs effect on deep learning and polygenic risk scores AUC

Published: 8 August 2022

ABSTRACT

Genotype-phenotype classification can be performed with several methods, such as polygenic risk scores (PRS) and deep learning, each using a different computational technique. The performance of each method varies with the genetic variation in the data and is measured by accuracy or area under the curve (AUC). This article investigates, through extensive computation, how the performance of deep learning classifiers and polygenic risk scores for genotype-phenotype classification changes with heritability, genetic variation, and the number of risk SNPs (400 different datasets of 5000 people). This variation helps to identify an optimal classifier for a dataset with a specific heritability and gives an expected score for a specific case/control classification.

The AUC of the deep learning classifier decreases as heritability increases, whereas the AUC of polygenic risk scores improves. The machine-learning algorithm has a low AUC for high genetic variation and a high AUC for low genetic variation. PRS tools show the opposite behavior: their AUC is higher on datasets with high genetic variation than on datasets with low genetic variation.

The article gives a basic template showing whether deep learning or PRS tools should be used, depending on the heritability and genetic variation of the dataset. All the code segments are publicly available to generate datasets with different parameters and explore such patterns.
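As a rough illustration of the kind of experiment the abstract describes, the sketch below simulates a case/control dataset for a chosen heritability and number of risk SNPs, builds a simple polygenic risk score, and reports its AUC. It is not the authors' code: the liability-threshold simulation, the use of case/control mean allele-count differences as PRS weights, and every name and default (`simulate_dataset`, `prs_weights`, `auc_score`, `prs_auc`, `maf`, `n_snps`, the 70/30 split) are assumptions made for illustration only.

```python
import numpy as np


def simulate_dataset(n_people=5000, n_snps=200, n_risk=20, h2=0.5, maf=0.3, seed=0):
    """Simulate 0/1/2 genotypes and balanced case/control labels.

    Assumed model (not taken from the paper): independent SNPs and a
    liability-threshold phenotype whose genetic part has variance h2.
    """
    rng = np.random.default_rng(seed)
    genotypes = rng.binomial(2, maf, size=(n_people, n_snps)).astype(float)

    # Only n_risk randomly chosen SNPs carry a non-zero effect.
    effects = np.zeros(n_snps)
    risk_idx = rng.choice(n_snps, size=n_risk, replace=False)
    effects[risk_idx] = rng.normal(size=n_risk)

    # Scale the genetic component to variance h2 and the noise to 1 - h2.
    genetic = genotypes @ effects
    genetic = (genetic - genetic.mean()) / genetic.std() * np.sqrt(h2)
    noise = rng.normal(scale=np.sqrt(1.0 - h2), size=n_people)
    liability = genetic + noise

    # People above the median liability are labelled as cases.
    labels = (liability > np.median(liability)).astype(int)
    return genotypes, labels


def prs_weights(genotypes, labels):
    """Per-SNP weight: mean allele count in cases minus mean in controls.

    This is a simple stand-in for GWAS effect sizes, chosen for brevity.
    """
    return genotypes[labels == 1].mean(axis=0) - genotypes[labels == 0].mean(axis=0)


def auc_score(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def prs_auc(h2, n_risk=20, seed=0, train_fraction=0.7):
    """Fit PRS weights on a training split and return the AUC on the held-out rest."""
    genotypes, labels = simulate_dataset(h2=h2, n_risk=n_risk, seed=seed)
    n_train = int(train_fraction * len(labels))
    weights = prs_weights(genotypes[:n_train], labels[:n_train])
    scores = genotypes[n_train:] @ weights  # PRS = weighted sum of allele counts
    return auc_score(scores, labels[n_train:])


if __name__ == "__main__":
    for h2 in (0.1, 0.5, 0.9):
        print(f"h2={h2:.1f}  PRS AUC={prs_auc(h2):.3f}")
```

Sweeping `h2` and `n_risk` over a grid gives one dataset per parameter combination, which is one plausible way to explore patterns like those reported above; a deep learning classifier would be trained on the same splits for comparison.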

      • Published in

        ICBBT '22: Proceedings of the 14th International Conference on Bioinformatics and Biomedical Technology
        May 2022
        190 pages
        ISBN:9781450396387
        DOI:10.1145/3543377

        Copyright © 2022 ACM


        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited
