Abstract
Feature selection is a crucial pre-processing step in machine learning and data mining. A popular approach is based on information-theoretic measures, but most existing methods use low-dimensional mutual information terms that are ineffective at detecting high-order feature interactions. To fill this gap, we employ higher-order interactions for feature selection. We first relax the assumptions of MI-based methods to allow for higher-order interactions. Because a direct calculation of the interaction terms is computationally expensive, we estimate them with four-dimensional joint mutual information, a computationally efficient measure. We also use the 'maximum of the minimum' nonlinear approach to avoid overestimating feature significance. The result is an effective feature selection method that exploits higher-order interactions. To evaluate its performance, we compare the proposed method with seven representative feature selection methods: RelaxMRMR, JMIM, IWFS, CIFE, MIFS, MIM, and ReliefF. Experimental results on eighteen benchmark data sets demonstrate that higher-order interactions are effective in improving MI-based feature selection.
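The selection scheme sketched in the abstract — scoring candidates by joint mutual information with the class, combined with a 'maximum of the minimum' rule so that a candidate is only as good as its weakest pairing with the already-selected features — can be illustrated as follows. This is a simplified sketch, not the authors' algorithm: it uses three-dimensional joint MI terms I(f_j, f_s; Y) rather than the paper's four-dimensional terms, assumes discrete integer-coded features, and all function names are hypothetical.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information between two discrete variables (in nats)."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)       # contingency counts
    joint /= joint.sum()                      # joint probabilities
    px = joint.sum(axis=1, keepdims=True)     # marginal of x
    py = joint.sum(axis=0, keepdims=True)     # marginal of y
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def joint_mi(f1, f2, y):
    """I(f1, f2; Y): MI between the joint variable (f1, f2) and the class."""
    # Encode the pair (f1, f2) injectively as a single discrete variable.
    pair = f1.astype(np.int64) * (int(f2.max()) + 1) + f2
    return mutual_information(pair, y)

def max_min_jmi_selection(X, y, k):
    """Greedy selection with a 'maximum of the minimum' criterion: each
    candidate is scored by the smallest joint MI it attains with any
    already-selected feature, and the best-scoring candidate is added."""
    n_features = X.shape[1]
    # Seed with the single most relevant feature.
    relevance = [mutual_information(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            score = min(joint_mi(X[:, j], X[:, s], y) for s in selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected
```

Taking the minimum over already-selected partners guards against the overestimation the abstract mentions: a candidate cannot score well merely because it pairs well with one selected feature while being redundant with the rest.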
References
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97(1–2), 273–324 (1997)
Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1–2), 23–69 (2003)
Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. PAMI 27(8), 1226–1238 (2005)
Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint l2,1-norms minimization. In: NIPS, pp. 1813–1821 (2010)
Hinton, G.E.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Brown, G., Pocock, A., Zhao, M.J., Luján, M.: Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. JMLR 13(1), 27–66 (2012)
Lewis, D.D.: Feature selection and feature extraction for text categorization. In: Proceedings of Speech and Natural Language Workshop, pp. 212–217 (1992)
Battiti, R.: Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw. 5(4), 537–550 (1994)
Lin, D., Tang, X.: Conditional infomax learning: an integrated framework for feature extraction and fusion. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 68–82. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_6
Vinh, N.X., Zhou, S., Chan, J., Bailey, J.: Can high-order dependencies improve mutual information based feature selection? Pattern Recogn. 53, 46–58 (2016)
Bennasar, M., Hicks, Y., Setchi, R.: Feature selection using joint mutual information maximisation. Expert Syst. Appl. 42(22), 8520–8532 (2015)
Zeng, Z., Zhang, H., Zhang, R., Yin, C.: A novel feature selection method considering feature interaction. Pattern Recogn. 48(8), 2656–2666 (2015)
Shishkin, A., Bezzubtseva, A., Drutsa, A., Shishkov, I., Gladkikh, E., Gusev, G., Serdyukov, P.: Efficient high-order interaction-aware feature selection based on conditional mutual information. In: NIPS, pp. 4637–4645 (2016)
Fleuret, F.: Fast binary feature selection with conditional mutual information. JMLR 5(8), 1531–1555 (2004)
Jakulin, A.: Machine learning based on attribute interactions. Ph.D. thesis, University of Ljubljana (2005)
Balagani, K.S., Phoha, V.V.: On the feature selection criterion based on an approximation of multidimensional mutual information. PAMI 32(7), 1342–1343 (2010)
Brown, P.F., Desouza, P.V., Mercer, R.L., Pietra, V.J.D., Lai, J.C.: Class-based n-gram models of natural language. Comput. Linguist. 18(4), 467–479 (1992)
Scott, D.W.: Multivariate Density Estimation. Wiley Series in Probability and Statistics, 2nd edn. Wiley, Hoboken (2015)
Acknowledgments
The authors would like to thank the anonymous reviewers for their careful reading of this paper and for their constructive comments and suggestions. This work is supported by the National Natural Science Foundation of China under Grant no. 61602094.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Tang, X., Dai, Y., Xiang, Y., Luo, L. (2018). An Interaction-Enhanced Feature Selection Algorithm. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science(), vol 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93039-8
Online ISBN: 978-3-319-93040-4