A Comparative Study of Gene Selection Methods for Microarray Cancer Classification

  • Conference paper
Proceedings of the International Conference on Data Engineering 2015 (DaEng-2015)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 520)

Abstract

In recent years, DNA microarray technology has attracted growing attention in both scientific and industrial fields. Identifying the informative genes that cause cancer is important for improving early diagnosis and for delivering effective chemotherapy. To gain deeper insight into the cancer classification problem, it is necessary to take a closer look at the gene selection methods that have been proposed. We believe that gene selection should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a significant issue in cancer classification, because such a method reduces the dimensionality of the microarray dataset and selects informative genes. In this paper, we review, classify, and compare state-of-the-art gene selection methods. We evaluate each gene selection approach by its classification accuracy and by the number of informative genes it selects, using four benchmark microarray datasets for cancer diagnosis (Leukemia, Colon, Lung, and Prostate). In addition, we compare the methods to identify those that can select a small set of marker genes while ensuring high cancer classification accuracy.
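The two evaluation criteria named in the abstract, classification accuracy and number of selected genes, can be illustrated with a minimal sketch. It is not one of the methods compared in the paper: the signal-to-noise filter, the nearest-centroid classifier, the leave-one-out protocol, the function names, and the synthetic data are all assumptions chosen to keep the example self-contained. No benchmark dataset is used.

```python
import random
import statistics


def snr_scores(X, y):
    """Score each gene by |mean1 - mean0| / (sd1 + sd0) (a simple filter criterion)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b)
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / (spread + 1e-12))
    return scores


def select_top_genes(X, y, k):
    """Return the indices of the k highest-scoring genes."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid (on the selected genes) is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [statistics.mean(r[g] for r in rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy; genes are re-selected in every fold so the
    held-out sample never influences the selection."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_top_genes(X_train, y_train, k)
        correct += nearest_centroid_predict(X_train, y_train, X[i], genes) == y[i]
    return correct / len(X)


def make_synthetic(n_samples=40, n_genes=200, n_informative=5, seed=0):
    """Synthetic two-class data (hypothetical): the first n_informative genes
    are shifted in class 1, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 4.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    for k in (1, 5, 20):
        print(f"genes={k:3d}  LOOCV accuracy={loocv_accuracy(X, y, k):.2f}")
```

Re-running the selection inside each fold matters when samples are few and genes are many: selecting genes on the full dataset before cross-validation lets the test sample leak into the selection and inflates the reported accuracy.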



Author information

Corresponding author

Correspondence to Hala Alshamlan.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Alshamlan, H., Badr, G., Alohali, Y. (2019). A Comparative Study of Gene Selection Methods for Microarray Cancer Classification. In: Abawajy, J., Othman, M., Ghazali, R., Deris, M., Mahdin, H., Herawan, T. (eds) Proceedings of the International Conference on Data Engineering 2015 (DaEng-2015). Lecture Notes in Electrical Engineering, vol 520. Springer, Singapore. https://doi.org/10.1007/978-981-13-1799-6_60
