Hybrid Feature Ranking for Proteins Classification

  • Conference paper
Advanced Data Mining and Applications (ADMA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Abstract

Hybrid feature ranking is a feature selection method that combines the speed of the filter approach with the accuracy of the wrapper approach. The main idea is a two-step procedure: first, build a sequence of feature subsets using an informational criterion, independently of the learning method; second, select the best subset by evaluating the cross-validation error rate, explicitly using the learning method. In this paper, we show that in the protein discrimination domain, which has few examples but numerous descriptors, taking the redundancy of descriptors into account when constructing the candidate feature subsets reduces the size of the optimal subset and, in certain cases, improves accuracy, compared with a traditional approach in which each descriptor is evaluated separately in the first step.
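The two-step procedure can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the informational criterion, the redundancy rule, or the learning method, so the sketch assumes mutual information as the criterion, a simple mean-redundancy penalty (similar in spirit to mRMR), and a categorical naive Bayes classifier. All function names, parameters, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def rank_features(X, y, penalize_redundancy=False):
    """Step 1 (filter): greedily order feature indices by an informational score.

    With penalize_redundancy=True, a candidate's relevance is reduced by its
    mean mutual information with the features already chosen (assumed rule).
    """
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    relevance = [mutual_information(c, y) for c in cols]
    remaining, order = set(range(len(cols))), []
    while remaining:
        def score(j):
            if not penalize_redundancy or not order:
                return relevance[j]
            redundancy = sum(mutual_information(cols[j], cols[k])
                             for k in order) / len(order)
            return relevance[j] - redundancy
        best = max(sorted(remaining), key=score)  # ties go to the lowest index
        order.append(best)
        remaining.remove(best)
    return order

def naive_bayes_predict(X_train, y_train, X_test, feats):
    """Categorical naive Bayes with add-one smoothing, restricted to `feats`."""
    classes = sorted(set(y_train))
    prior = Counter(y_train)
    counts = defaultdict(int)   # (class, feature, value) -> count
    values = defaultdict(set)   # feature -> values seen in training
    for row, c in zip(X_train, y_train):
        for j in feats:
            counts[(c, j, row[j])] += 1
            values[j].add(row[j])
    preds = []
    for row in X_test:
        def log_posterior(c):
            lp = math.log(prior[c] / len(y_train))
            for j in feats:
                # The extra +1 in the denominator reserves mass for unseen values.
                lp += math.log((counts[(c, j, row[j])] + 1)
                               / (prior[c] + len(values[j]) + 1))
            return lp
        preds.append(max(classes, key=log_posterior))
    return preds

def cv_error(X, y, feats, k=5):
    """k-fold cross-validation error rate of naive Bayes on the subset `feats`."""
    errors = 0
    for fold in range(k):
        test = [i for i in range(len(X)) if i % k == fold]
        train = [i for i in range(len(X)) if i % k != fold]
        preds = naive_bayes_predict([X[i] for i in train], [y[i] for i in train],
                                    [X[i] for i in test], feats)
        errors += sum(p != y[i] for p, i in zip(preds, test))
    return errors / len(X)

def hybrid_select(X, y, k=5, penalize_redundancy=False):
    """Step 2 (wrapper): among the nested subsets, keep the lowest CV error."""
    order = rank_features(X, y, penalize_redundancy)
    candidates = [order[:m] for m in range(1, len(order) + 1)]
    # min() returns the first minimum, i.e. the smallest subset among ties.
    return min(candidates, key=lambda s: cv_error(X, y, s, k))

# Toy data (hypothetical): feature 0 equals the class, feature 1 duplicates
# feature 0, and feature 2 is independent of the class.
y = [i % 2 for i in range(20)]
X = [[y[i], y[i], (i // 2) % 2] for i in range(20)]
print(hybrid_select(X, y, penalize_redundancy=True))
```

Only the nested subsets produced in step 1 are passed to the cross-validation step, so the learning method is trained a number of times proportional to the number of features rather than to the number of possible subsets. This is where the hybrid approach gets its speed relative to a pure wrapper search.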


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rakotomalala, R., Mhamdi, F., Elloumi, M. (2005). Hybrid Feature Ranking for Proteins Classification. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_72

  • DOI: https://doi.org/10.1007/11527503_72

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer Science, Computer Science (R0)
