Feature Ranking for Protein Classification

Mhamdi, Faouzi; Rakotomalala, Ricco; Elloumi, Mourad

doi:10.1007/3-540-32390-2_72

Part of the book series: Advances in Soft Computing (AINSC, volume 30)

Abstract

In this paper, a knowledge discovery framework is used for protein classification. The processing is carried out in three steps: feature extraction, feature ranking, and feature selection. In the first step, inspired by results from text mining, we use n-gram descriptors. In the second step, the descriptors are ranked by their chi-2 statistical indices. In the final step, we select the subset of descriptors that minimizes the prediction error rate of a k-nearest neighbor classifier. Experiments show that this framework gives good results: the dimensionality reduction is effective and improves classifier performance.
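
The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the n-gram length, the descriptor encoding, the value of k, the distance measure, or the error estimate, so the sketch assumes 2-grams, binary presence/absence descriptors, 1-nearest neighbor with Hamming distance, and a leave-one-out error rate. The toy sequences, class labels, and all function names are invented for illustration.

```python
from collections import Counter

def ngrams(seq, n):
    """Set of n-grams (length-n substrings) occurring in seq."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}

def build_table(seqs, n):
    """Step 1: binary table, 1 if the n-gram occurs in the sequence (assumed encoding)."""
    grams = [ngrams(s, n) for s in seqs]
    vocab = sorted(set().union(*grams))
    rows = [[1 if g in gs else 0 for g in vocab] for gs in grams]
    return vocab, rows

def chi2(column, labels):
    """Pearson chi-square statistic of one descriptor against the class label."""
    total = len(labels)
    observed = Counter(zip(column, labels))
    value_counts = Counter(column)
    class_counts = Counter(labels)
    stat = 0.0
    for v, nv in value_counts.items():
        for c, nc in class_counts.items():
            expected = nv * nc / total  # > 0: only observed values and classes are visited
            stat += (observed[(v, c)] - expected) ** 2 / expected
    return stat

def rank_features(rows, labels):
    """Step 2: descriptor indices sorted by decreasing chi-square score."""
    scores = [chi2([r[j] for r in rows], labels) for j in range(len(rows[0]))]
    ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
    return ranking, scores

def loo_knn_error(rows, labels, features, k=1):
    """Leave-one-out error of k-NN using Hamming distance on the chosen features."""
    errors = 0
    for i, row in enumerate(rows):
        dists = sorted(
            (sum(row[j] != other[j] for j in features), idx)
            for idx, other in enumerate(rows) if idx != i
        )
        votes = Counter(labels[idx] for _, idx in dists[:k])
        if votes.most_common(1)[0][0] != labels[i]:
            errors += 1
    return errors / len(rows)

def select_features(rows, labels, ranking, k=1):
    """Step 3: keep the shortest prefix of the ranking with the lowest error."""
    best_m, best_err = 1, float("inf")
    for m in range(1, len(ranking) + 1):
        err = loo_knn_error(rows, labels, ranking[:m], k)
        if err < best_err:
            best_m, best_err = m, err
    return ranking[:best_m], best_err

# Toy data, invented for illustration only (not taken from the paper).
seqs = ["MKVLA", "MKVLG", "AMKVL", "GGSTA", "LGGST", "GGSTV"]
labels = ["a", "a", "a", "b", "b", "b"]

vocab, rows = build_table(seqs, n=2)
ranking, scores = rank_features(rows, labels)
selected, err = select_features(rows, labels, ranking, k=1)
print([vocab[j] for j in selected], err)
```

On this toy data a single top-ranked 2-gram already separates the two classes, which is the kind of dimensionality reduction the abstract reports; on real protein families the selected subset would typically be larger, and the choice of n and k would need tuning.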

Author information

Authors and Affiliations

Faculty of Sciences of Tunis, URPAH, Tunisia
Faouzi Mhamdi & Mourad Elloumi
ERIC, University of Lyon 2, Lyon, France
Ricco Rakotomalala

Editor information

Editors and Affiliations

Faculty of Electronics, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
Marek Kurzyński, Edward Puchała, Michał Woźniak & Andrzej Żołnierek

About this paper

Cite this paper

Mhamdi, F., Rakotomalala, R., Elloumi, M. (2005). Feature Ranking for Protein Classification. In: Kurzyński, M., Puchała, E., Woźniak, M., Żołnierek, A. (eds) Computer Recognition Systems. Advances in Soft Computing, vol 30. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32390-2_72

DOI: https://doi.org/10.1007/3-540-32390-2_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25054-8
Online ISBN: 978-3-540-32390-7
eBook Packages: Engineering, Engineering (R0)
