Abstract
Susceptibility to Alzheimer’s disease is likely due to complex interactions among many genetic and environmental factors. Identifying complex genetic effects in large data sets will require computational methods that extend beyond what parametric statistical methods such as logistic regression can provide. We have previously introduced a computational evolution system (CES) that uses genetic programming (GP) to represent genetic models of disease and to search for optimal models in a rugged fitness landscape that is effectively infinite in size. The CES approach differs from other GP approaches in that it learns how to solve the problem by generating its own operators. A key feature is the ability of these operators to use expert knowledge to guide the stochastic search. We have previously shown that CES is able to discover nonlinear genetic models of disease susceptibility in both simulated and real data. The goal of the present study was to introduce a measure of interestingness into the modeling process. Here, we define interestingness as a measure of non-additive gene-gene interactions. That is, we are more interested in those CES models that include attributes that exhibit synergistic effects on disease risk. To implement this new feature, we first pre-processed the data to measure all pairwise gene-gene interaction effects using entropy-based methods. We then provided these pre-computed measures to CES both as expert knowledge and as one of three fitness criteria in a three-dimensional Pareto optimization. We applied this new CES algorithm to an Alzheimer’s disease data set with approximately 520,000 genetic attributes. We show that this approach discovers more interesting models, with the added benefit of improved classification accuracy. This study demonstrates the applicability of CES to genome-wide genetic analysis using expert knowledge derived from measures of interestingness.
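The two ingredients named in the abstract, an entropy-based pairwise interaction score and Pareto selection over three criteria, can be illustrated with a minimal Python sketch. It assumes that the interaction score is the information gain IG(A;B;C) = I(A,B;C) − I(A;C) − I(B;C), which is one common entropy-based measure of synergy; the abstract does not state the exact formula used in the chapter. The function names, the toy genotype data, and the three objective scores are all hypothetical and are not taken from CES.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(a, b, c):
    """IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C).

    Assumed measure (not confirmed by the abstract): positive values
    suggest that attributes a and b are synergistic with respect to class c.
    """
    joint = list(zip(a, b))
    return (mutual_information(joint, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


def dominates(p, q):
    """True if p is no worse than q on every objective and better on one.

    All objectives are treated as 'larger is better' (an assumption).
    """
    return (all(pi >= qi for pi, qi in zip(p, q))
            and any(pi > qi for pi, qi in zip(p, q)))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy example: case status is the XOR of two attributes, so neither
# attribute is informative alone but the pair is fully informative.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]
gain = interaction_gain(snp_a, snp_b, status)

# Hypothetical models scored on three criteria to maximize.
models = [(0.70, 0.20, -3), (0.60, 0.10, -4), (0.65, 0.30, -5)]
front = pareto_front(models)
```

In this toy case the pair carries information about status that neither attribute carries alone, which is the kind of non-additive effect the abstract calls interesting. The second hypothetical model is dropped from the front because the first is better on all three scores, while the third survives because it scores higher on the second criterion.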
Acknowledgements
This work was supported by NIH grants LM011360, LM009012, LM010098 and AI59694. We would like to thank the participants of present and past Genetic Programming Theory and Practice Workshops (GPTP) for their stimulating feedback and discussion that helped formulate some of the ideas in this paper.
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Moore, J.H., Hill, D.P., Saykin, A., Shen, L. (2014). Exploring Interestingness in a Computational Evolution System for the Genome-Wide Genetic Analysis of Alzheimer’s Disease. In: Riolo, R., Moore, J., Kotanchek, M. (eds) Genetic Programming Theory and Practice XI. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0375-7_2
DOI: https://doi.org/10.1007/978-1-4939-0375-7_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-0374-0
Online ISBN: 978-1-4939-0375-7
eBook Packages: Computer Science; Computer Science (R0)