Reliable Attribute Selection Based on Random Forest (RASER)

Noura, Aboudi; Shili, Hechmi; Romdhane, Lotfi Ben

doi:10.1007/978-3-319-53480-0_2

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 557)

Included in the following conference series:

International Conference on Intelligent Systems Design and Applications

Abstract

Feature selection has become one of the most active research areas in data mining. It removes redundant and irrelevant attributes from large datasets. Although the literature offers several attribute selection methods, this article proposes a new multi-objective method for selecting relevant and non-redundant features. The proposed method has three stages. The first stage computes a relevance value for each feature using random forests. The second stage computes a dissimilarity matrix representing the dependence between the features of the training dataset, and transforms it into a complete graph whose nodes are the features and whose edge weights are the dissimilarities between them. The last stage applies a multi-objective optimization algorithm. The method is applied to many datasets to find the most relevant and non-redundant features, and its performance is compared with that of the popular MBEGA, mRMR (MIQ) and mRMR (MID) methods.
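The three stages can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the dissimilarity measure or the optimizer, so the sketch assumes 1 - |Pearson correlation| as the dissimilarity and replaces the multi-objective optimization algorithm with an exhaustive Pareto-front enumeration over fixed-size subsets. The relevance scores are hypothetical placeholders for the values a random forest would supply, and all function names are illustrative.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def dissimilarity_graph(columns):
    """Complete graph as an edge dict: one node per feature,
    edge weight = 1 - |correlation| (an assumed dissimilarity measure)."""
    return {(i, j): 1.0 - abs(pearson(columns[i], columns[j]))
            for i, j in combinations(range(len(columns)), 2)}

def objectives(subset, relevance, graph):
    """Two objectives to maximise: mean relevance and mean pairwise dissimilarity."""
    rel = mean(relevance[i] for i in subset)
    dis = mean(graph[pair] for pair in combinations(subset, 2))
    return rel, dis

def pareto_front(relevance, graph, size):
    """Score every subset of the given size and keep the non-dominated ones.
    Exhaustive search is a stand-in for a real multi-objective optimizer."""
    scored = [(s, objectives(s, relevance, graph))
              for s in combinations(range(len(relevance)), size)]

    def dominated(a):
        # b dominates a if it is at least as good on both objectives and differs from a
        return any(b[0] >= a[0] and b[1] >= a[1] and b != a for _, b in scored)

    return [(s, o) for s, o in scored if not dominated(o)]

# Toy data: four feature columns over six samples; feature 1 nearly duplicates feature 0.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 13],
    [1, -1, 1, -1, 1, -1],
    [3, 1, 4, 1, 5, 9],
]
# Stage 1 placeholder: hypothetical relevance scores, standing in for random forest importances.
relevance = [0.40, 0.35, 0.15, 0.10]

graph = dissimilarity_graph(columns)          # stage 2
front = pareto_front(relevance, graph, 2)     # stage 3
for subset, (rel, dis) in front:
    print(subset, round(rel, 3), round(dis, 3))
```

In this toy run the near-duplicate pair (0, 1) scores well on relevance but has almost zero dissimilarity, which is the trade-off a multi-objective search is meant to expose. Exhaustive enumeration is only feasible for a handful of features; a heuristic optimizer would be needed at realistic dimensionality.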

Author information

Authors and Affiliations

Department of Computer Science, University of Monastir, Monastir, Tunisia
Aboudi Noura & Hechmi Shili
Department of Computer Science, University of Sousse, Sousse, Tunisia
Lotfi Ben Romdhane

Corresponding author

Correspondence to Aboudi Noura.

Editor information

Editors and Affiliations

Departamento de Engenharia Informática, Instituto Superior de Engenharia do Porto, Porto, Portugal
Ana Maria Madureira
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs, Auburn, Washington, USA
Ajith Abraham
Polytechnic Institute of Porto, Felgueiras, Portugal
Dorabela Gamboa
Campus of Gualtar, University of Minho, Braga, Portugal
Paulo Novais

About this paper

Cite this paper

Noura, A., Shili, H., Romdhane, L.B. (2017). Reliable Attribute Selection Based on Random Forest (RASER). In: Madureira, A., Abraham, A., Gamboa, D., Novais, P. (eds) Intelligent Systems Design and Applications. ISDA 2016. Advances in Intelligent Systems and Computing, vol 557. Springer, Cham. https://doi.org/10.1007/978-3-319-53480-0_2

DOI: https://doi.org/10.1007/978-3-319-53480-0_2
Published: 23 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53479-4
Online ISBN: 978-3-319-53480-0
eBook Packages: Engineering, Engineering (R0)
