A Multi-objective Genetic Programming Biomarker Detection Approach in Mass Spectrometry Data

  • Conference paper
Applications of Evolutionary Computation (EvoApplications 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9597)

Abstract

Mass spectrometry is currently the most commonly used technology in biochemical research for proteomic analysis. The main goal of proteomic profiling using mass spectrometry is the classification of samples from different clinical states. This requires the identification of proteins or peptides (biomarkers) that are expressed differentially between those states. However, the high dimensionality of the data and the small number of samples make classification of mass spectrometry data a challenging task. An effective feature manipulation algorithm, through either feature selection or feature construction, is therefore needed to enhance classification performance while minimising the number of features. Most feature manipulation methods for mass spectrometry data treat this problem as a single-objective task that focuses only on improving classification performance. This paper presents two new methods for biomarker detection through multi-objective feature selection and feature construction. The results show that the proposed multi-objective feature selection method obtains better feature subsets than the single-objective algorithm and two traditional multi-objective approaches for feature selection. Moreover, the multi-objective feature construction algorithm further improves performance over the multi-objective feature selection algorithm. This paper presents the first multi-objective genetic programming approach for biomarker detection in mass spectrometry data.
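The trade-off described in the abstract, between classification performance and the number of features, can be illustrated with a minimal sketch of Pareto dominance over two minimised objectives. This is not the authors' algorithm: the names `dominates` and `pareto_front` and the candidate values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One candidate feature subset, scored on two objectives that are both minimised:
# (classification error rate, number of selected features).
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, preserving input order."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical scores for illustration only (not results from the paper).
subsets = [(0.10, 50), (0.12, 20), (0.10, 30), (0.25, 5), (0.30, 25)]
front = pareto_front(subsets)
print(front)
```

A single-objective method would report only the lowest-error subset, whereas a multi-objective method returns the whole non-dominated set and leaves the choice between accuracy and subset size to the user.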

Author information

Corresponding author

Correspondence to Bing Xue.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Ahmed, S., Zhang, M., Peng, L., Xue, B. (2016). A Multi-objective Genetic Programming Biomarker Detection Approach in Mass Spectrometry Data. In: Squillero, G., Burelli, P. (eds) Applications of Evolutionary Computation. EvoApplications 2016. Lecture Notes in Computer Science, vol 9597. Springer, Cham. https://doi.org/10.1007/978-3-319-31204-0_8

  • DOI: https://doi.org/10.1007/978-3-319-31204-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31203-3

  • Online ISBN: 978-3-319-31204-0

  • eBook Packages: Computer Science, Computer Science (R0)
