Artificial neural network model for effective cancer classification using microarray gene expression data

  • Original Article
Neural Computing and Applications

Abstract

Microarray gene expression profiles can be exploited for the efficient and effective classification of cancers. This is a computationally challenging task because gene expression data contain a large number of genes but a relatively small number of experiments. The aim of this work is to devise a framework of supervised machine learning techniques for discriminating acute lymphoblastic leukemia from acute myeloid leukemia using microarray gene expression profiles. An artificial neural network (ANN) was employed for this classification and compared with five other machine learning techniques. All methods were assessed on eight different classification performance measures. This article reports a classification accuracy of 98% using the ANN, with no errors in the identification of acute lymphoblastic leukemia and only one error in the identification of acute myeloid leukemia, under tenfold cross-validation and the leave-one-out approach. Furthermore, the models were validated on independent test data, and all samples were correctly classified.
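
The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It generates index splits for k-fold cross-validation (leave-one-out is the special case where k equals the number of samples) and derives a few common performance measures from confusion-matrix counts. The function names, the sample count, and the confusion counts below are hypothetical and chosen only for illustration; the abstract does not list the eight measures or the software used, so this is not the author's implementation.

```python
import math
from typing import Dict, Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def leave_one_out_indices(n_samples: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Leave-one-out is k-fold cross-validation with one sample per fold."""
    return k_fold_indices(n_samples, n_samples)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute common measures from the counts of a two-class confusion matrix."""
    return {
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),
        "sensitivity": _ratio(tp, tp + fn),  # recall on the positive class
        "specificity": _ratio(tn, tn + fp),  # recall on the negative class
        "precision": _ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical sample count, not the size of the data set used in the article.
    for train, test in k_fold_indices(n_samples=20, k=10):
        print(len(train), len(test))
    # Hypothetical confusion counts, not results from the article.
    print(binary_metrics(tp=10, fn=0, tn=9, fp=1))
```

In practice the sample order would be shuffled, and usually stratified by class, before splitting, so that each fold contains both leukemia types; the contiguous folds above keep the sketch short.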

Acknowledgements

The authors are extremely thankful to the Department of Biotechnology, New Delhi, for providing the Bioinformatics Infrastructure Facility of DBT at MANIT Bhopal.

Author information

Corresponding author

Correspondence to Ashok Kumar Dwivedi.

Ethics declarations

Conflict of interest

None.

About this article

Cite this article

Dwivedi, A.K. Artificial neural network model for effective cancer classification using microarray gene expression data. Neural Comput & Applic 29, 1545–1554 (2018). https://doi.org/10.1007/s00521-016-2701-1

  • DOI: https://doi.org/10.1007/s00521-016-2701-1
