Abstract
In recent years, ensemble learning has received much attention, mainly in classification tasks, based on the assumption that combining the outputs of multiple experts is better than relying on the output of any single expert. This idea can be adapted to feature selection, with different feature selection algorithms acting as different experts. In this paper we propose an ensemble for feature selection based on combining feature rankings, aiming to overcome the problem of selecting an appropriate ranker method for each problem at hand. The individual rankings are combined with SVM Rank, and the adequacy of the ensemble is subsequently tested using an SVM classifier. Results on five UCI datasets show that the proposed ensemble yields performance better than or comparable to that of the individual feature selection methods.
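To make the ranking-combination idea concrete, below is a minimal sketch in Python (using only numpy) of ensemble feature ranking. Note the assumptions: the paper learns the combination with SVM Rank, whereas here a simple mean-rank aggregation stands in for that learned combination, and the three scoring functions are illustrative stand-ins for the individual rankers, not the ones evaluated in the paper.

```python
# Minimal sketch of ensemble feature ranking (mean-rank aggregation as a
# stand-in for the SVM Rank combination used in the paper).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                    # toy data: 100 samples, 10 features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(float)   # target driven by features 0 and 3

def rank_by_score(scores):
    """Return feature indices ranked best-first (higher score = better)."""
    return np.argsort(-np.asarray(scores))

# Three illustrative "expert" rankers (hypothetical stand-ins for the
# individual filter/ranker methods an ensemble would combine).
corr_scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
var_scores = X.var(axis=0)
diff_scores = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))

rankings = [rank_by_score(s) for s in (corr_scores, var_scores, diff_scores)]

# Mean-rank aggregation: average each feature's position across all rankers,
# then re-rank features by that average position (lower = better).
positions = np.zeros(X.shape[1])
for r in rankings:
    pos = np.empty_like(r)
    pos[r] = np.arange(len(r))    # position of each feature within ranking r
    positions += pos
ensemble_ranking = np.argsort(positions / len(rankings))

k = 3
print("Top-%d features:" % k, ensemble_ranking[:k])
```

In the full pipeline described by the abstract, the selected top-k subset would then be fed to an SVM classifier to assess the adequacy of the ensemble.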
This research has been financially supported in part by the Ministerio de Economía y Competitividad of the Spanish Government through research project TIN 2012-37954, partially funded by FEDER funds of the European Union, and by the Consellería de Industria of the Xunta de Galicia through research project GRC2014/035.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Seijo-Pardo, B., Bolón-Canedo, V., Porto-Díaz, I., Alonso-Betanzos, A. (2015). Ensemble Feature Selection for Rankings of Features. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2015. Lecture Notes in Computer Science, vol 9095. Springer, Cham. https://doi.org/10.1007/978-3-319-19222-2_3
DOI: https://doi.org/10.1007/978-3-319-19222-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19221-5
Online ISBN: 978-3-319-19222-2