DOI: 10.1145/3543377.3543387
research-article

Heritability, genetic variation, and the number of risk SNPs effect on deep learning and polygenic risk scores AUC

Published: 08 August 2022

Abstract

Several methods are used for genotype-phenotype classification, including polygenic risk scores (PRS) and deep learning, each relying on a different computational technique. The performance of each method varies with the genetic variation in the data and is measured by accuracy or area under the curve (AUC). This article investigates, through extensive computation, how the performance of deep learning classifiers and polygenic risk scores on genotype-phenotype classification changes with heritability, genetic variation, and the number of risk SNPs (400 different datasets of 5000 people). These variations help to identify an optimal classifier for a dataset with a specific heritability and an expected score for a specific case/control classification.
The AUC of the deep learning classifier decreases as heritability increases, whereas the AUC of polygenic risk scores improves. The machine-learning algorithm has a low AUC for high genetic variation and a high AUC for low genetic variation. PRS tools show the opposite behavior: their AUC is higher for datasets with high genetic variation than for datasets with low genetic variation.
The article gives a basic template showing whether deep learning or PRS tools should be used, depending on the heritability and genetic variation of the dataset. All code segments are publicly available for generating datasets with different parameters and exploring such patterns.
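To make the experimental setup concrete, the following is a minimal sketch of how one such dataset could be simulated and scored. It is not the authors' released code. It assumes a liability-threshold model in which heritability is the share of liability variance explained by the risk SNPs, and it uses a simple marginal-regression PRS. The function names (`simulate_dataset`, `fit_prs_weights`, `auc`) and all default parameters other than the 5000-person sample size are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_dataset(n_people=5000, n_snps=1000, n_risk=50, h2=0.5,
                     prevalence=0.5, seed=0):
    """Simulate genotypes and case/control labels (assumed liability model)."""
    rng = np.random.default_rng(seed)
    # Minor allele frequencies and allele counts (0, 1, or 2) per person and SNP.
    maf = rng.uniform(0.05, 0.5, size=n_snps)
    G = rng.binomial(2, maf, size=(n_people, n_snps)).astype(float)
    # Only n_risk randomly chosen SNPs carry a non-zero effect.
    beta = np.zeros(n_snps)
    risk_idx = rng.choice(n_snps, size=n_risk, replace=False)
    beta[risk_idx] = rng.normal(size=n_risk)
    # Standardize the genetic component so that it explains a share h2
    # of the liability variance; the rest is environmental noise.
    g = G @ beta
    g = (g - g.mean()) / g.std()
    liability = np.sqrt(h2) * g + np.sqrt(1.0 - h2) * rng.normal(size=n_people)
    # People above the liability threshold are labeled as cases.
    threshold = np.quantile(liability, 1.0 - prevalence)
    y = (liability > threshold).astype(int)
    return G, y, beta


def fit_prs_weights(G, y):
    """Per-SNP marginal regression slopes, used here as simple PRS weights."""
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    return (Gc.T @ yc) / np.maximum((Gc ** 2).sum(axis=0), 1e-12)


def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney U); ties in scores are not averaged."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    for h2 in (0.2, 0.5, 0.8):
        G, y, _ = simulate_dataset(h2=h2, seed=1)
        half = len(y) // 2
        # Estimate weights on the first half, evaluate on the second half.
        weights = fit_prs_weights(G[:half], y[:half])
        print(f"h2={h2:.1f}  PRS test AUC={auc(G[half:] @ weights, y[half:]):.3f}")
```

Sweeping `h2`, `n_risk`, and the allele-frequency range over a grid, and swapping the PRS step for a neural-network classifier, would give the kind of comparison the abstract describes. The specific trends reported in the article come from the authors' own datasets and tools and are not guaranteed to be reproduced by this simplified model.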


Published In

ICBBT '22: Proceedings of the 14th International Conference on Bioinformatics and Biomedical Technology
May 2022
190 pages
ISBN: 9781450396387
DOI: 10.1145/3543377

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. applied deep learning
2. genetic variation
3. heritability
4. polygenic risk scores
5. risk SNPs

Qualifiers

• Research-article
• Research
• Refereed limited

Funding Sources

• Khalifa University of Science and Technology

Conference

ICBBT 2022
