Towards the Use of Genetic Programming for the Prediction of Survival in Cancer

Abstract

Risk stratification of cancer patients, that is, the prediction of the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years, gene expression profiling has been successfully combined with the clinical and histological criteria traditionally used for such predictions. Many research groups have introduced and tested gene expression signatures, that is, sets of genes whose expression values in a tumor can be used to predict the outcome of the pathology. One well-known example is the 70-gene signature, on which we recently tested several machine learning techniques in order to maximize its predictive power. Genetic Programming (GP) was shown to perform significantly better than other techniques, including Support Vector Machines, Multilayer Perceptrons, and Random Forests, in classifying patients. GP has the further advantage over the other methods of performing automatic feature selection. Importantly, by using a weighted average of false positives and false negatives in the definition of the fitness, we showed that GP can outperform all the other methods in minimizing false negatives (one of the main goals in clinical applications) without compromising the overall minimization of incorrectly classified instances. The solutions returned by GP are also appealing from a clinical point of view: they are simple, easy to understand, and built from a rather limited subset of the available features.
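
The weighted fitness mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or the weights used, so the function name `weighted_error_fitness`, the parameter `fn_weight`, its default value, and the 0/1 label encoding below are all assumptions made for illustration only. The idea shown is a fitness to be minimized in which a false negative costs more than a false positive.

```python
from typing import Sequence


def weighted_error_fitness(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    fn_weight: float = 0.75,  # hypothetical weight, not taken from the chapter
) -> float:
    """Weighted average of false negatives and false positives (lower is better).

    Labels are assumed to be 1 for a poor-outcome patient and 0 otherwise.
    A false negative is a poor-outcome patient classified as good-outcome.
    """
    if not 0.0 <= fn_weight <= 1.0:
        raise ValueError("fn_weight must lie in [0, 1]")
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the two kinds of misclassification separately.
    false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    # Convex combination: fn_weight > 0.5 penalizes false negatives more.
    return fn_weight * false_negatives + (1.0 - fn_weight) * false_positives


if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    predicted = [0, 0, 1, 0]  # two false negatives, one false positive
    print(weighted_error_fitness(truth, predicted))
    print(weighted_error_fitness(truth, predicted, fn_weight=0.5))
```

With `fn_weight` set to 0.5 the measure is proportional to the plain count of misclassified instances; raising it shifts the selection pressure towards classifiers that miss fewer poor-outcome patients.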

An erratum to this chapter can be found at http://dx.doi.org/10.1007/978-3-642-37577-4_18

Acknowledgments

This work was partially supported by the Neuroscience Program of the Compagnia di San Paolo in Torino.

Author information

Corresponding author

Correspondence to Marco Giacobini.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Giacobini, M., Provero, P., Vanneschi, L., Mauri, G. (2014). Towards the Use of Genetic Programming for the Prediction of Survival in Cancer. In: Cagnoni, S., Mirolli, M., Villani, M. (eds) Evolution, Complexity and Artificial Life. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37577-4_12

  • DOI: https://doi.org/10.1007/978-3-642-37577-4_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37576-7

  • Online ISBN: 978-3-642-37577-4

  • eBook Packages: Computer Science, Computer Science (R0)
