Abstract
In recent years, ensemble learning has received much attention, mainly in classification tasks, based on the assumption that combining the outputs of multiple experts is better than relying on the output of any single expert. This idea can be adapted to feature selection, with different feature selection algorithms acting as different experts. In this paper we propose an ensemble for feature selection based on combining feature rankings, aiming to overcome the problem of selecting an appropriate ranker method for each problem at hand. The individual rankings are combined with SVM Rank, and the adequacy of the ensemble is subsequently tested using an SVM classifier. Results on five UCI datasets show that the proposed ensemble yields performance better than or comparable to that of the individual feature selection methods.
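To make the ranking-combination idea concrete, below is a minimal sketch in Python (using only numpy) of ensemble feature ranking. Note the assumptions: the paper learns the combination with SVM Rank, whereas here a simple mean-rank aggregation stands in for that learned combination, and the three scoring functions are illustrative stand-ins for the individual rankers, not the ones evaluated in the paper.

```python
# Minimal sketch of ensemble feature ranking (mean-rank aggregation as a
# stand-in for the SVM Rank combination used in the paper).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                    # toy data: 100 samples, 10 features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(float)   # target driven by features 0 and 3

def rank_by_score(scores):
    """Return feature indices ranked best-first (higher score = better)."""
    return np.argsort(-np.asarray(scores))

# Three illustrative "expert" rankers (hypothetical stand-ins for the
# individual filter/ranker methods an ensemble would combine).
corr_scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
var_scores = X.var(axis=0)
diff_scores = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))

rankings = [rank_by_score(s) for s in (corr_scores, var_scores, diff_scores)]

# Mean-rank aggregation: average each feature's position across all rankers,
# then re-rank features by that average position (lower = better).
positions = np.zeros(X.shape[1])
for r in rankings:
    pos = np.empty_like(r)
    pos[r] = np.arange(len(r))    # position of each feature within ranking r
    positions += pos
ensemble_ranking = np.argsort(positions / len(rankings))

k = 3
print("Top-%d features:" % k, ensemble_ranking[:k])
```

In the full pipeline described by the abstract, the selected top-k subset would then be fed to an SVM classifier to assess the adequacy of the ensemble.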
This research has been financially supported in part by the Ministerio de Economía y Competitividad of the Spanish Government through research project TIN 2012-37954, partially funded by FEDER funds of the European Union, and by the Consellería de Industria of the Xunta de Galicia through research project GRC2014/035.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Seijo-Pardo, B., Bolón-Canedo, V., Porto-Díaz, I., Alonso-Betanzos, A. (2015). Ensemble Feature Selection for Rankings of Features. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2015. Lecture Notes in Computer Science, vol 9095. Springer, Cham. https://doi.org/10.1007/978-3-319-19222-2_3
DOI: https://doi.org/10.1007/978-3-319-19222-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19221-5
Online ISBN: 978-3-319-19222-2