Combination of Feature Selection Methods for the Effective Classification of Microarray Gene Expression Data

  • Conference paper
Recent Trends in Image Processing and Pattern Recognition (RTIP2R 2016)

Abstract

Gene selection from microarray gene expression data is difficult because of the high dimensionality of the data: the number of samples in a microarray data set is very small compared with the number of genes used as features. Selecting significant genes is therefore necessary to reduce dimensionality. An effective gene selection method reduces dimensionality and improves the performance of sample classification. In this work, we examine whether combining feature selection methods can improve the performance of classification algorithms. We propose two methods for combining feature selection techniques. Experimental results suggest that an appropriate combination of filter gene selection methods is more effective than the individual techniques for microarray data classification. We compare our combination methods using different learning algorithms.
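
The abstract does not say which filters are combined or how, so the following is only an illustrative sketch of one common way to combine filter gene selection methods, not the authors' procedure. It assumes two filter scores (an absolute Welch t-statistic and an absolute signal-to-noise ratio), a two-class problem with labels 0 and 1, and combination by mean rank. All function names are hypothetical.

```python
from statistics import mean, stdev


def t_score(x, y):
    """Absolute Welch t-statistic between two groups of expression values."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return abs(mean(x) - mean(y)) / se if se > 0 else 0.0


def snr_score(x, y):
    """Absolute signal-to-noise ratio: |mean difference| / (sum of std devs)."""
    denom = stdev(x) + stdev(y)
    return abs(mean(x) - mean(y)) / denom if denom > 0 else 0.0


def rank_genes(scores):
    """Map gene index -> rank (0 is best); a higher score gives a better rank."""
    order = sorted(range(len(scores)), key=lambda g: -scores[g])
    return {gene: rank for rank, gene in enumerate(order)}


def combine_filters(data, labels, k):
    """Return the k gene indices with the best mean rank across both filters.

    data   -- list of samples, each a list of expression values (one per gene)
    labels -- class label (0 or 1) for each sample (assumed two-class problem)
    """
    n_genes = len(data[0])
    per_filter_ranks = []
    for score in (t_score, snr_score):
        scores = []
        for g in range(n_genes):
            x = [row[g] for row, lab in zip(data, labels) if lab == 0]
            y = [row[g] for row, lab in zip(data, labels) if lab == 1]
            scores.append(score(x, y))
        per_filter_ranks.append(rank_genes(scores))
    # Mean-rank aggregation; ties are broken by gene index.
    avg_rank = {g: mean(r[g] for r in per_filter_ranks) for g in range(n_genes)}
    return sorted(range(n_genes), key=lambda g: (avg_rank[g], g))[:k]


# Toy data: 6 samples x 4 genes; genes 0 and 2 differ between the classes.
toy_data = [
    [1.0, 5.0, 2.0, 3.0],
    [1.2, 4.8, 2.5, 3.1],
    [0.9, 5.1, 2.2, 2.9],
    [5.0, 5.0, 3.5, 3.0],
    [5.2, 4.9, 3.9, 3.1],
    [4.9, 5.2, 3.6, 2.9],
]
toy_labels = [0, 0, 0, 1, 1, 1]
selected = combine_filters(toy_data, toy_labels, k=2)
```

The selected gene indices would then be used to restrict the data before training a classifier. Other combination rules, such as taking the union or intersection of each filter's top-k genes, fit the same structure.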

Author information

Corresponding author

Correspondence to T. Sheela.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sheela, T., Rangarajan, L. (2017). Combination of Feature Selection Methods for the Effective Classification of Microarray Gene Expression Data. In: Santosh, K., Hangarge, M., Bevilacqua, V., Negi, A. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2016. Communications in Computer and Information Science, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-4859-3_13

  • DOI: https://doi.org/10.1007/978-981-10-4859-3_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-4858-6

  • Online ISBN: 978-981-10-4859-3

  • eBook Packages: Computer Science, Computer Science (R0)