Classification of bioinformatics dataset using finite impulse response extreme learning machine for cancer diagnosis

  • Extreme Learning Machine's Theory & Application
  • Neural Computing and Applications

Abstract

In this paper, the classification of two binary bioinformatics datasets, leukemia and colon tumor, is further studied using the recently developed neural network-based finite impulse response extreme learning machine (FIR-ELM). A time series analysis of the microarray samples is first performed to determine the filtering properties of the hidden layer of the FIR-ELM neural classifier for feature identification. The linear separability of the data patterns in the microarray datasets is then studied. To improve the robustness of the neural classifier against noise and errors, a frequency domain gene feature selection algorithm is also proposed. The simulation results show that the FIR-ELM algorithm performs excellently in classifying bioinformatics data in comparison with many existing classification algorithms.
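To make the idea in the abstract concrete, the following is a minimal sketch of an FIR-ELM-style classifier, not the authors' implementation. It assumes that each sample is treated as a one-dimensional series along the gene axis, that the hidden layer is linear with input weights fixed to a windowed-sinc low-pass FIR filter, and that the output weights are found by ridge-regularized least squares. The function names, the filter length, the cutoff, the regularization constant, and the synthetic two-class data are all hypothetical choices made for illustration; the preview does not state the paper's actual filter design or datasets' preprocessing.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Windowed-sinc low-pass FIR filter; cutoff is a fraction of Nyquist (0..1)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = cutoff * np.sinc(cutoff * n) * np.hamming(num_taps)
    return h / h.sum()  # normalize to unit gain at zero frequency

def fir_input_weights(num_inputs, num_hidden, num_taps=21, cutoff=0.1):
    """Fixed (untrained) input weights: each hidden node applies the same FIR
    filter, centered at a different position along the gene axis.
    Assumes num_taps is odd. This layout is an illustrative assumption."""
    h = lowpass_fir(num_taps, cutoff)
    half = num_taps // 2
    centers = np.linspace(half, num_inputs - half - 1, num_hidden).round().astype(int)
    W = np.zeros((num_inputs, num_hidden))
    for j, c in enumerate(centers):
        W[c - half:c + half + 1, j] = h
    return W

def train_output_weights(H, t, lam=1e-2):
    """Ridge-regularized least squares for the output layer."""
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ t)

def make_samples(rng, labels, num_genes):
    """Synthetic data (assumption): a slow pattern whose sign encodes the class,
    buried in white noise. Stands in for real microarray samples."""
    pattern = np.sin(2 * np.pi * 2 * np.arange(num_genes) / num_genes)
    return labels[:, None] * pattern + rng.normal(size=(labels.size, num_genes))

rng = np.random.default_rng(0)
num_genes, num_hidden = 200, 20
t_train = np.repeat([1.0, -1.0], 20)
t_test = np.repeat([1.0, -1.0], 10)
X_train = make_samples(rng, t_train, num_genes)
X_test = make_samples(rng, t_test, num_genes)

W = fir_input_weights(num_genes, num_hidden)        # hidden layer acts as a filter
beta = train_output_weights(X_train @ W, t_train)   # only the output layer is fitted
pred = np.sign(X_test @ W @ beta)
accuracy = float(np.mean(pred == t_test))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The design point the sketch tries to show is that the hidden layer is never trained: it only suppresses high-frequency variation along the gene axis, so the single least-squares solve for the output weights operates on smoothed features. Swapping the low-pass kernel for a band-pass or high-pass design would change which components of the sample reach the output layer.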

Author information

Correspondence to Kevin Lee.

About this article

Cite this article

Lee, K., Man, Z., Wang, D. et al. Classification of bioinformatics dataset using finite impulse response extreme learning machine for cancer diagnosis. Neural Comput & Applic 22, 457–468 (2013). https://doi.org/10.1007/s00521-012-0847-z

  • DOI: https://doi.org/10.1007/s00521-012-0847-z
