A Novel Hybrid Method of Gene Selection and Its Application on Tumor Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5227)

Abstract

Microarray gene expression profiles can be used to predict tumor types accurately, which is of great value for choosing better treatments and minimizing toxicity for patients. However, classifying tumor types from microarray data is difficult because the number of samples is much smaller than the number of genes. A small subset of feature genes has been shown to improve classification accuracy, so feature gene selection and extraction algorithms are very important in tumor classification. In this paper, a novel hybrid gene selection method is proposed to find a feature gene subset in which the genes related to a given cancer are kept and the redundant genes are left out. The proposed method combines the advantages of PCA and LDA in a novel feature gene extraction scheme. We also compare several parametric and non-parametric feature gene selection methods. We use an SVM as the classifier in the experiments, compare the performance of three common SVM kernels, and analyze their differences. Using n-fold cross-validation, the proposed algorithm is evaluated on three published benchmark tumor datasets, and the experimental results show that it leads to better classification performance than the other methods.
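The abstract refers to parametric and non-parametric feature gene selection methods without naming them. As a rough illustration only, the sketch below ranks genes with a Welch t-statistic (a parametric score) and with a Wilcoxon rank-sum deviation (a non-parametric score). Both choices are assumptions made for this example and are not necessarily the methods used in the paper. The function names (`t_score`, `rank_sum_score`, `select_top_genes`) and the toy expression matrix are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_score(a, b):
    """Absolute Welch t-statistic between two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_sum_score(a, b):
    """Absolute deviation of group a's rank sum from its expected value."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of the 1-based ranks i+1 .. j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum = sum(rank_of[v] for v in a)
    expected = len(a) * (len(a) + len(b) + 1) / 2
    return abs(rank_sum - expected)


def select_top_genes(X, y, k, score):
    """Return the indices of the k genes with the highest score.

    X is a list of samples (one list of expression values per sample),
    y holds the class label (0 or 1) of each sample.
    """
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(score(a, b))
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order[:k]


# Toy data (invented): 6 samples, 4 genes; only gene 2 separates the classes.
X = [
    [5.0, 1.0, 0.1, 3.0],
    [5.2, 0.9, 0.3, 2.0],
    [4.9, 1.2, 0.2, 4.0],
    [5.1, 1.1, 2.1, 3.5],
    [5.0, 1.0, 2.4, 2.5],
    [4.8, 0.8, 2.2, 3.0],
]
y = [0, 0, 0, 1, 1, 1]

print(select_top_genes(X, y, 1, t_score))         # [2]
print(select_top_genes(X, y, 1, rank_sum_score))  # [2]
```

On real microarray data, where genes number in the thousands and samples in the tens, a filter of this kind would typically be applied inside each cross-validation fold, so that the held-out samples do not influence which genes are selected.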


Editor information

De-Shuang Huang, Donald C. Wunsch II, Daniel S. Levine, Kang-Hyun Jo

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

You, Z., Wang, S., Gui, J., Zhang, S. (2008). A Novel Hybrid Method of Gene Selection and Its Application on Tumor Classification. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2008. Lecture Notes in Computer Science, vol 5227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85984-0_127

  • DOI: https://doi.org/10.1007/978-3-540-85984-0_127

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85983-3

  • Online ISBN: 978-3-540-85984-0

  • eBook Packages: Computer Science, Computer Science (R0)
