Cross Validation Consistency for the Assessment of Genetic Programming Results in Microarray Studies

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2611)

Abstract

DNA microarray technology has made it possible to measure the expression levels of thousands of genes simultaneously in a particular cell or tissue. The challenge for computational biologists and bioinformaticists will be to develop methods that can identify subsets of gene expression variables and features that classify cells and tissues into meaningful biological and clinical groups. Genetic programming (GP) has emerged as a machine learning tool for variable and feature selection in microarray data analysis. However, a limitation of GP is the lack of cross validation strategies for assessing its results. This is partly due to the inherent complexity of GP that arises from its stochastic properties. Here, we introduce cross validation consistency (CVC) as a new modeling strategy for use with GP. We then review the application of CVC to symbolic discriminant analysis (SDA), a GP-based analytical strategy for mining gene expression patterns in DNA microarray data.
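
The abstract names CVC without spelling out a procedure, so the following is only a minimal sketch of the general idea, under the assumption that CVC tallies how often each gene is selected across cross validation training splits. The selector used here (ranking genes by the absolute difference in class means) is a simple stand-in, not GP or SDA, and the function names, parameters, and toy data are all hypothetical.

```python
import random
from collections import Counter


def select_genes(samples, labels, n_genes):
    """Stand-in selector (NOT GP or SDA): rank genes by the absolute
    difference between the two class means and keep the top n_genes."""
    num_features = len(samples[0])
    scores = []
    for g in range(num_features):
        group0 = [s[g] for s, y in zip(samples, labels) if y == 0]
        group1 = [s[g] for s, y in zip(samples, labels) if y == 1]
        diff = abs(sum(group0) / len(group0) - sum(group1) / len(group1))
        scores.append((diff, g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:n_genes]]


def cross_validation_consistency(samples, labels, n_folds=10, n_genes=2, seed=0):
    """Assumed form of CVC: count, for each gene, the number of training
    splits in which the selector picks it."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    # Partition the shuffled sample indices into n_folds held-out sets.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    counts = Counter()
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        chosen = select_genes(
            [samples[i] for i in train],
            [labels[i] for i in train],
            n_genes,
        )
        counts.update(chosen)
    return counts


# Hypothetical toy data: 40 samples, 6 genes; only gene 0 differs by class.
data_rng = random.Random(42)
labels = [i % 2 for i in range(40)]
samples = [
    [data_rng.gauss(5.0 * y if g == 0 else 0.0, 1.0) for g in range(6)]
    for y in labels
]

cvc = cross_validation_consistency(samples, labels, n_folds=10, n_genes=2)
for gene, count in cvc.most_common():
    print(f"gene {gene}: selected in {count} of 10 splits")
```

Under this reading, a gene selected in most or all splits is a more credible finding than one that appears only once or twice, which is one way to temper the run-to-run variability of a stochastic search.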

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moore, J.H. (2003). Cross Validation Consistency for the Assessment of Genetic Programming Results in Microarray Studies. In: Cagnoni, S., et al. Applications of Evolutionary Computing. EvoWorkshops 2003. Lecture Notes in Computer Science, vol 2611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36605-9_10

  • DOI: https://doi.org/10.1007/3-540-36605-9_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00976-4

  • Online ISBN: 978-3-540-36605-8

  • eBook Packages: Springer Book Archive
