
Reliable Attribute Selection Based on Random Forest (RASER)

  • Conference paper
Intelligent Systems Design and Applications (ISDA 2016)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 557))

Abstract

Feature selection has become one of the most active research areas in data mining: it removes redundant and irrelevant attributes from large datasets. Several attribute selection methods exist in the literature. In this article, a new multi-objective method is proposed to select relevant and non-redundant features. The proposed method proceeds in three stages. The first stage computes each feature's relevance value using random forests. The second stage computes a dissimilarity matrix representing the dependence between the features of the training datasets, and transforms it into a complete graph whose nodes represent features and whose edge weights are the dissimilarity values between them. The last stage applies a multi-objective optimization algorithm to this graph. The proposed method is applied to many datasets to find the most relevant and non-redundant features, and its performance is compared with that of the popular MBEGA, mRMR (MIQ), and mRMR (MID) methods.
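The three stages above can be sketched as follows. This is a minimal illustration, not the authors' RASER implementation: the 1 − |Pearson correlation| dissimilarity, the greedy weighted-sum scalarization (standing in for a true multi-objective search such as NSGA-II), and the use of the Iris dataset are all assumptions made for brevity.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Stage 1: feature relevance values from random-forest importances.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
relevance = rf.feature_importances_

# Stage 2: dissimilarity matrix over the features (assumption: 1 - |Pearson
# correlation|), viewed as the edge weights of a complete graph whose nodes
# are the features.
corr = np.corrcoef(X, rowvar=False)
dissimilarity = 1.0 - np.abs(corr)

# Stage 3: select a subset balancing high relevance against low redundancy.
# A full multi-objective optimizer is replaced here by a greedy weighted-sum
# scalarization (alpha trades off the two objectives).
def select(relevance, dissimilarity, k, alpha=0.5):
    selected = [int(np.argmax(relevance))]   # seed with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(len(relevance)):
            if j in selected:
                continue
            # average dissimilarity of candidate j to the already-chosen set
            diversity = dissimilarity[j, selected].mean()
            score = alpha * relevance[j] + (1 - alpha) * diversity
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

subset = select(relevance, dissimilarity, k=2)
print(subset)
```

With a Pareto-based optimizer in stage 3, relevance and non-redundancy would be kept as separate objectives instead of being collapsed into one weighted score.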


References

  1. Samb, M.L., Camara, F., Ndiaye, S., Slimani, Y., Esseghir, M.A.: Approche de sélection d'attributs pour la classification basée sur l'algorithme RFE-SVM

  2. Chouaib, H.: Sélection de caractéristiques: méthodes et applications (2011). http://www.math-info.univ-paris5.fr/~vincent/siten/Publications/theses/pdf/chouaib.pdf

  3. Zhu, Z., Ong, Y.-S., Dash, M.: Markov blanket-embedded genetic algorithm for gene selection. Pattern Recogn. 40, 3236–3248 (2007). http://www.sciencedirect.com/science/article/pii/S0031320307000945

  4. John, G.H.: Enhancements to the data mining process. Ph.D. thesis, Stanford University (1997)

  5. Kohavi, R., Pfleger, K., John, G.H.: Irrelevant features and the subset selection problem, pp. 121–129 (1994)

  6. Mandal, M., Mukhopadhyay, A.: A graph-theoretic approach for identifying non-redundant and relevant gene markers from microarray data using multiobjective binary PSO. PLoS ONE 9(3), e90949 (2014)

  7. Koller, D., Sahami, M.: Toward optimal feature selection, pp. 284–292. Stanford InfoLab, Stanford (1996)

  8. Saeys, Y., Inza, I., Larrañaga, P.: A review of feature selection techniques in bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)

  9. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27(8), 1226–1238 (2005)

  10. Battiti, R.: Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw. 5, 537–550 (1994)

  11. You, W., Yang, Z., Ji, G.: PLS-based recursive feature elimination for high-dimensional small sample. Knowl.-Based Syst. 55, 15–28 (2014)

  12. Zhou, Q., Zhou, H., Zhou, Q., Yang, F., Luo, L.: Structure damage detection based on random forest recursive feature elimination. Mech. Syst. Sig. Process. 46(1), 82–90 (2014)

  13. Azhagusundari, B., Thanamani, A.S.: Feature selection based on information gain. Int. J. Innov. Technol. Explor. Eng. (IJITEE), ISSN 2278–3075 (2013)

  14. Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. 40(1), 16–28 (2014)

  15. Yu, L., Liu, H.: Feature selection for high-dimensional data: a fast correlation-based filter solution. In: ICML, vol. 3, pp. 856–863 (2003)

  16. Ghattas, B., Ishak, A.B.: Sélection de variables pour la classification binaire en grande dimension: comparaisons et application aux données de biopuces. J. de la société française de statistique 149(3), 43–66 (2008)

  17. Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D.: Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. 95(25), 14863–14868 (1998)

  18. Crescenzi, P., Kann, V., Halldórsson, M.: A compendium of NP optimization problems (1995)

  19. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets.html


Author information

Corresponding author: Aboudi Noura


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Noura, A., Shili, H., Romdhane, L.B. (2017). Reliable Attribute Selection Based on Random Forest (RASER). In: Madureira, A., Abraham, A., Gamboa, D., Novais, P. (eds) Intelligent Systems Design and Applications. ISDA 2016. Advances in Intelligent Systems and Computing, vol 557. Springer, Cham. https://doi.org/10.1007/978-3-319-53480-0_2


  • DOI: https://doi.org/10.1007/978-3-319-53480-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-53479-4

  • Online ISBN: 978-3-319-53480-0

  • eBook Packages: Engineering, Engineering (R0)
