Abstract
Selecting the best group of features from high-dimensional datasets is an important challenge in machine learning; indeed, problems with hundreds of features have become common. In the context of filter methods, the relevance criterion used for filtering is the key component of a feature selection method. To help select an appropriate criterion among the many existing ones, this paper proposes a list of six necessary properties. The paper then describes three relevance criteria, namely mutual information, noise variance and adjusted R-squared, and compares them with respect to the aforementioned properties. Any new or popular criterion can be analysed in the light of these properties.
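As a rough illustration of the three criteria named above, the following minimal sketch (not the authors' implementation, and using off-the-shelf scikit-learn estimators as an assumption) ranks individual features of a synthetic regression problem by mutual information, by a 1-nearest-neighbour noise-variance estimate (the delta test), and by the adjusted R-squared of a univariate linear fit.

```python
# Minimal sketch: filter-style ranking of individual features with the three
# criteria discussed in the paper. The estimators used here (scikit-learn's
# mutual_info_regression, a 1-NN delta test, a linear adjusted R-squared)
# are illustrative choices, not the paper's exact estimators.
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import NearestNeighbors


def delta_test(x, y):
    """Noise-variance estimate: half the mean squared difference between the
    target of each point and the target of its nearest neighbour in x."""
    nn = NearestNeighbors(n_neighbors=2).fit(x)
    _, idx = nn.kneighbors(x)              # idx[:, 0] is the point itself
    return 0.5 * np.mean((y - y[idx[:, 1]]) ** 2)


def adjusted_r2(x, y):
    """Adjusted R-squared of a linear fit on the given feature(s)."""
    n, p = x.shape
    r2 = LinearRegression().fit(x, y).score(x, y)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
# Only the first two features are relevant to the target.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=500)

mi = mutual_info_regression(X, y, random_state=0)
for j in range(X.shape[1]):
    xj = X[:, [j]]
    print(f"feature {j}: MI={mi[j]:.3f}  "
          f"delta={delta_test(xj, y):.3f}  adjR2={adjusted_r2(xj, y):.3f}")
```

A filter would keep the features with the highest mutual information or adjusted R-squared, or the lowest noise-variance estimate; note that, unlike the linear adjusted R-squared, the first two criteria can detect the nonlinear relevance of the first feature.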
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Degeest, A., Verleysen, M., Frénay, B. (2019). About Filter Criteria for Feature Selection in Regression. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_48
DOI: https://doi.org/10.1007/978-3-030-20518-8_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20517-1
Online ISBN: 978-3-030-20518-8