Abstract
The availability of chip-based technology has transformed human genetics and made the measurement of thousands of DNA sequence variations routine, giving rise to an informatics challenge: identifying combinations of interacting DNA sequence variations that are predictive of common diseases. We have previously developed Multifactor Dimensionality Reduction (MDR), a method capable of detecting these interactions, but an exhaustive MDR analysis is exponential in time complexity and thus unsuitable for an interaction analysis of genome-wide datasets. We therefore look to stochastic search approaches to find a suitable wrapper for the analysis of these data. We have previously shown that an ant colony optimization (ACO) framework can be successfully applied to human genetics when expert knowledge is included. Here we integrate an ACO stochastic search wrapper into the open source MDR software package. In this wrapper we also introduce a scaling method based on an exponential distribution function with a single user-adjustable parameter. We obtain expert knowledge from Tuned ReliefF (TuRF), a method capable of detecting attribute interactions in the absence of main effects, and perform a power analysis at different parameter settings. We show that the expert knowledge distribution parameter, the retention factor, and the weighting of expert knowledge significantly affect the power of the method.
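To make the three parameters named above more concrete, the following is a minimal sketch of how they might interact in a generic ant-system-style search. It is not the implementation in the MDR package: the abstract does not give the formulas, so the rank-based exponential scaling, the function names, and the symbols `theta`, `beta`, and `retention` are all assumptions chosen for illustration, and the pheromone rule is the textbook ant system form.

```python
import math


def exponential_weights(scores, theta):
    """Scale expert-knowledge scores (e.g. TuRF scores) to positive weights.

    ASSUMPTION: attributes are ranked by score and the weight decays
    exponentially with rank; `theta` stands in for the single
    user-adjustable distribution parameter (larger = steeper decay).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * len(scores)
    for rank, i in enumerate(order):
        weights[i] = math.exp(-theta * rank)
    return weights


def selection_probabilities(pheromone, expert_weights, alpha=1.0, beta=1.0):
    """Probability that an ant picks each attribute.

    Textbook ant system rule: pheromone^alpha * heuristic^beta, normalized.
    ASSUMPTION: `beta` plays the role of the weighting of expert knowledge.
    """
    raw = [(p ** alpha) * (w ** beta) for p, w in zip(pheromone, expert_weights)]
    total = sum(raw)
    return [r / total for r in raw]


def update_pheromone(pheromone, chosen, reward, retention):
    """Evaporate pheromone, then reinforce the attributes an ant chose.

    ASSUMPTION: `retention` is the fraction of pheromone kept each
    iteration (one minus the evaporation rate); `reward` would come from
    the quality of the MDR model built on the chosen attributes.
    """
    updated = [retention * p for p in pheromone]
    for i in chosen:
        updated[i] += reward
    return updated


if __name__ == "__main__":
    turf_scores = [0.1, 0.9, 0.5]          # hypothetical expert-knowledge scores
    weights = exponential_weights(turf_scores, theta=0.5)
    pheromone = [1.0, 1.0, 1.0]
    probs = selection_probabilities(pheromone, weights, beta=2.0)
    print(probs)
    print(update_pheromone(pheromone, chosen=[1], reward=0.5, retention=0.9))
```

Under these assumptions, a larger `theta` or `beta` concentrates the search on attributes that the expert knowledge ranks highly, while a lower `retention` makes the colony forget earlier findings faster. This is one way to read the sensitivity of power to all three settings.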
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Greene, C.S., Gilmore, J.M., Kiralis, J., Andrews, P.C., Moore, J.H. (2009). Optimal Use of Expert Knowledge in Ant Colony Optimization for the Analysis of Epistasis in Human Disease. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2009. Lecture Notes in Computer Science, vol 5483. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01184-9_9
DOI: https://doi.org/10.1007/978-3-642-01184-9_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01183-2
Online ISBN: 978-3-642-01184-9
eBook Packages: Computer Science, Computer Science (R0)