On the Effectiveness of Gene Selection for Microarray Classification Methods

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5991)

Abstract

Microarray data usually contain a high level of noisy gene data, including incorrect, noisy, and irrelevant genes. Before Microarray data classification takes place, it is desirable to eliminate as much of this noise as possible. One approach to improving the accuracy and efficiency of Microarray data classification is to select a small subset of genes from the large, high-dimensional gene expression dataset. Effective gene selection cleans up the existing Microarray data and thereby improves its quality. In this paper, we study the effectiveness of gene selection for Microarray classification methods. We conducted experiments with two benchmark classification algorithms, SVMs and C4.5. We observed that, in general, the performance of SVMs and C4.5 improves in terms of accuracy and efficiency when the preprocessed datasets are used rather than the original datasets, whereas an inappropriate choice of genes can only be detrimental to predictive power. Our results also imply that, with preprocessing, the number of genes selected affects the classification accuracy.
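
As a rough illustration of the workflow the abstract describes (select a small gene subset, then classify), the following sketch ranks genes with a simple signal-to-noise filter and evaluates a classifier for several subset sizes. It is not the paper's method: the synthetic data, the filter score, the hold-out split, and all function names are assumptions made for illustration, and a nearest-centroid classifier stands in for SVMs and C4.5 so that the example needs only the Python standard library.

```python
import random
import statistics


def make_toy_data(n_samples=40, n_genes=200, n_informative=5, seed=0):
    """Synthetic expression matrix (hypothetical): only the first
    n_informative genes are shifted by class; the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate the two classes
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def snr_scores(X, y):
    """Per-gene filter score: |mean1 - mean0| / (sd1 + sd0)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        denom = statistics.stdev(a) + statistics.stdev(b)
        diff = abs(statistics.mean(b) - statistics.mean(a))
        scores.append(diff / denom if denom > 0 else 0.0)
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring genes, best first."""
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order[:k]


def nearest_centroid_predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid (over the selected genes) is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [statistics.mean(r[g] for r in rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def holdout_accuracy(X, y, k, n_train=30):
    """Select k genes on the training part only, then score the held-out part."""
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]
    genes = select_top_k(snr_scores(X_tr, y_tr), k)
    hits = sum(
        nearest_centroid_predict(X_tr, y_tr, x, genes) == t
        for x, t in zip(X_te, y_te)
    )
    return hits / len(X_te)


X, y = make_toy_data()
for k in (1, 5, 50, 200):  # 200 means no selection at all
    print(f"genes kept: {k:3d}  hold-out accuracy: {holdout_accuracy(X, y, k):.2f}")
```

Genes are scored on the training samples only, so the held-out samples do not influence which genes are kept. Varying `k` in the loop is a small-scale way to observe the abstract's point that the number of selected genes can change classification accuracy; the numbers printed for this toy data say nothing about the paper's reported results.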

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Z., Li, J., Hu, H., Zhou, H. (2010). On the Effectiveness of Gene Selection for Microarray Classification Methods. In: Nguyen, N.T., Le, M.T., Świątek, J. (eds) Intelligent Information and Database Systems. ACIIDS 2010. Lecture Notes in Computer Science, vol 5991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12101-2_31

  • DOI: https://doi.org/10.1007/978-3-642-12101-2_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12100-5

  • Online ISBN: 978-3-642-12101-2

  • eBook Packages: Computer Science, Computer Science (R0)