Hybrid Feature Ranking for Proteins Classification

Rakotomalala, Ricco; Mhamdi, Faouzi; Elloumi, Mourad

doi:10.1007/11527503_72

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Hybrid feature ranking is a feature selection method that combines the speed of the filter approach with the accuracy of the wrapper approach. It is a two-step procedure: first, build a sequence of feature subsets using an informational criterion, independently of the learning method; second, select the best subset by its cross-validation error rate, explicitly using the learning method. In this paper, we consider the protein discrimination domain, which has few examples but numerous descriptors. We show that, compared with a traditional approach in which each descriptor is evaluated separately in the first step, taking the redundancy of the descriptors into account when constructing the candidate feature subsets reduces the size of the optimal subset and, in certain cases, improves accuracy.
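
The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the informational criterion or the learning method, so mutual information, the mean-redundancy penalty, the small naive Bayes classifier, and every function name and the toy data below are assumptions made for the example.

```python
import math
from collections import Counter


def mutual_info(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def rank_features(X, y, penalize_redundancy=True):
    """Step 1 (filter): order the features without using the learner."""
    cols = list(zip(*X))
    relevance = [mutual_info(col, y) for col in cols]
    remaining, order = list(range(len(cols))), []

    def score(j):
        if not penalize_redundancy or not order:
            return relevance[j]
        # Assumed penalty: mean mutual information with features already kept.
        redundancy = sum(mutual_info(cols[j], cols[s]) for s in order)
        return relevance[j] - redundancy / len(order)

    while remaining:
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order


def nb_predict(train_X, train_y, row, feats):
    """Categorical naive Bayes with Laplace smoothing, restricted to feats."""
    best, best_lp = None, -math.inf
    for c, nc in Counter(train_y).items():
        lp = math.log(nc / len(train_y))
        for j in feats:
            n_values = len({r[j] for r in train_X})
            match = sum(1 for r, t in zip(train_X, train_y)
                        if t == c and r[j] == row[j])
            lp += math.log((match + 1) / (nc + n_values))
        if lp > best_lp:
            best, best_lp = c, lp
    return best


def cv_error(X, y, feats, k=5):
    """Step 2 (wrapper): k-fold cross-validation error rate using feats."""
    n, errors = len(X), 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        errors += sum(nb_predict(tX, ty, X[i], feats) != y[i]
                      for i in range(fold, n, k))
    return errors / n


def hybrid_feature_ranking(X, y, k=5, penalize_redundancy=True):
    """Build nested subsets from the ranking, keep the one with lowest CV error."""
    order = rank_features(X, y, penalize_redundancy)
    candidates = [order[:m] for m in range(1, len(order) + 1)]
    # Ties on error are broken in favour of the smaller subset.
    return min(candidates, key=lambda s: (cv_error(X, y, s, k), len(s)))


# Toy data: feature 1 duplicates feature 0; feature 2 is weaker but complementary.
X = [(i % 2, i % 2, (i // 2) % 2) for i in range(16)]
y = [0 if a == 0 else 1 + b for a, _, b in X]

print(hybrid_feature_ranking(X, y, penalize_redundancy=False))  # [0, 1, 2]
print(hybrid_feature_ranking(X, y))                             # [0, 2]
```

On this toy data, ranking each feature separately places the duplicate second, so three features are needed before the error reaches zero; the redundancy-aware ranking defers the duplicate and reaches the same error with two. This only mirrors the qualitative claim of the abstract on a hand-built case and says nothing about the protein data used in the paper.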

Author information

Authors and Affiliations

ERIC Laboratory, University of Lyon 2, France
Ricco Rakotomalala
URPAH, University of Tunis, Tunisia
Faouzi Mhamdi & Mourad Elloumi

Editor information

Editors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
The State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 430072, Wuhan, China
Shuliang Wang
School of ITEE, The Univ of Queensland, St. Lucia, 4072, QLD, Australia
Zhao Yang Dong

About this paper

Cite this paper

Rakotomalala, R., Mhamdi, F., Elloumi, M. (2005). Hybrid Feature Ranking for Proteins Classification. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_72

DOI: https://doi.org/10.1007/11527503_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science (R0)
