Random Forest and Gene Networks for Association of SNPs to Alzheimer’s Disease

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8213)

Abstract

Machine learning methods such as Random Forest (RF) have been used in genome-wide association studies (GWAS) to predict disease risk and to select sets of single nucleotide polymorphisms (SNPs) associated with a disease. In this study, we extract information from biological networks to select candidate SNPs, which RF then uses for prediction and ranks by importance measures. Starting from an initial set of genes already related to a disease, we used the tool GeneMANIA to construct gene interaction networks and find novel genes that might be associated with Alzheimer’s Disease (AD). This makes it possible to extract a small number of SNPs, so that applying RF becomes feasible. The experiments conducted in this study investigate which SNPs may influence susceptibility to AD.
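
The candidate-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: GeneMANIA scores related genes with a network integration algorithm, whereas the sketch below swaps in a plain breadth-first neighbor expansion. The gene names, network edges, SNP identifiers, and the functions `expand_genes` and `candidate_snps` are all hypothetical placeholders chosen for illustration.

```python
from collections import deque


def expand_genes(seed_genes, network, max_hops=1):
    """Return the seed genes plus every gene within max_hops edges of a seed.

    Simplified stand-in for network-based gene discovery: a breadth-first
    neighbor expansion, not GeneMANIA's scoring algorithm.
    """
    seen = set(seed_genes)
    frontier = deque((gene, 0) for gene in seed_genes)
    while frontier:
        gene, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not walk past the hop limit
        for neighbor in network.get(gene, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


def candidate_snps(snp_to_gene, genes):
    """Keep only SNPs annotated to one of the selected genes."""
    return sorted(snp for snp, gene in snp_to_gene.items() if gene in genes)


# Toy interaction network (hypothetical genes and edges).
NETWORK = {
    "GENE_A": ["GENE_B", "GENE_C"],
    "GENE_B": ["GENE_A", "GENE_D"],
    "GENE_C": ["GENE_A"],
    "GENE_D": ["GENE_B", "GENE_E"],
    "GENE_E": ["GENE_D"],
}

# Toy SNP-to-gene annotation (hypothetical identifiers).
SNP_TO_GENE = {
    "snp_1": "GENE_A",
    "snp_2": "GENE_B",
    "snp_3": "GENE_D",
    "snp_4": "GENE_E",
    "snp_5": "GENE_F",
}

# One seed gene assumed to be already related to the disease.
genes = expand_genes({"GENE_A"}, NETWORK, max_hops=1)
snps = candidate_snps(SNP_TO_GENE, genes)
print(sorted(genes))
print(snps)
```

The reduced SNP list would then serve as the feature set for a Random Forest classifier, whose importance measures rank the SNPs. Raising `max_hops` widens the candidate set at the cost of more features.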

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Araújo, G.S., Souza, M.R.B., Oliveira, J.R.M., Costa, I.G. (2013). Random Forest and Gene Networks for Association of SNPs to Alzheimer’s Disease. In: Setubal, J.C., Almeida, N.F. (eds) Advances in Bioinformatics and Computational Biology. BSB 2013. Lecture Notes in Computer Science, vol 8213. Springer, Cham. https://doi.org/10.1007/978-3-319-02624-4_10

  • DOI: https://doi.org/10.1007/978-3-319-02624-4_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02623-7

  • Online ISBN: 978-3-319-02624-4

  • eBook Packages: Computer Science, Computer Science (R0)
