
An Interaction-Enhanced Feature Selection Algorithm

  • Conference paper

In: Advances in Knowledge Discovery and Data Mining (PAKDD 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10939)

Abstract

Feature selection is a crucial pre-processing step in machine learning and data mining. A popular approach is based on information-theoretic measures. Most existing methods use low-dimensional mutual information terms, which are ineffective at detecting high-order feature interactions. To fill this gap, we employ higher-order interactions for feature selection. We first relax the assumptions of MI-based methods to allow for higher-order interactions. Because a direct calculation of the interaction terms is computationally expensive, we use four-dimensional joint mutual information, a computationally efficient measure, to estimate them. We also adopt the ‘maximum of the minimum’ nonlinear approach to avoid overestimating feature significance. The result is an effective feature selection method that exploits higher-order interactions. To evaluate the proposed method, we compare it with seven representative feature selection methods: RelaxMRMR, JMIM, IWFS, CIFE, MIFS, MIM, and ReliefF. Experimental results on eighteen benchmark data sets demonstrate that higher-order interactions are effective in improving MI-based feature selection.
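The selection strategy the abstract describes, scoring each candidate feature by its joint mutual information with already-selected features and the class, then taking the ‘maximum of the minimum’ over those scores, can be illustrated with a minimal greedy selector. This is a sketch, not the paper’s algorithm: it assumes discrete features, uses three-variable joint MI I(Xj, Xs; Y) rather than the paper’s four-dimensional terms, and relies on simple plug-in entropy estimates. All function names are illustrative.

```python
import numpy as np

def entropy(*vars):
    """Joint Shannon entropy (in bits) of one or more discrete variables."""
    joint = np.stack(vars, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def joint_mi(xs, y):
    """I(X1,...,Xk ; Y) = H(X1,...,Xk) + H(Y) - H(X1,...,Xk, Y)."""
    return entropy(*xs) + entropy(y) - entropy(*xs, y)

def select_features_maxmin(X, y, k):
    """Greedy forward selection with a 'maximum of the minimum' criterion:
    each candidate j is scored by its worst-case joint MI with any
    already-selected feature s, min_s I(Xj, Xs; Y), and the candidate
    with the best worst-case score is added."""
    n_features = X.shape[1]
    # Initialise with the single most relevant feature (the MIM step).
    selected = [max(range(n_features), key=lambda j: joint_mi([X[:, j]], y))]
    while len(selected) < k:
        remaining = [j for j in range(n_features) if j not in selected]
        best = max(remaining,
                   key=lambda j: min(joint_mi([X[:, j], X[:, s]], y)
                                     for s in selected))
        selected.append(best)
    return selected
```

The max-of-min scoring guards against the overestimation the abstract mentions: a candidate only scores well if it is informative alongside every selected feature, not just one of them.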


Notes

  1. http://home.penglab.com/proj/mRMR/.

References

  1. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97(1–2), 273–324 (1997)

  2. Robnik-Šikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53(1–2), 23–69 (2003)

  3. Peng, H., Long, F., Ding, C.: Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. PAMI 27(8), 1226–1238 (2005)

  4. Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint l2,1-norms minimization. In: NIPS, pp. 1813–1821 (2010)

  5. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  6. Brown, G., Pocock, A., Zhao, M.J., Luján, M.: Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. JMLR 13(1), 27–66 (2012)

  7. Lewis, D.D.: Feature selection and feature extraction for text categorization. In: Proceedings of the Speech and Natural Language Workshop, pp. 212–217 (1992)

  8. Battiti, R.: Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw. 5(4), 537–550 (1994)

  9. Lin, D., Tang, X.: Conditional infomax learning: an integrated framework for feature extraction and fusion. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 68–82. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_6

  10. Vinh, N.X., Zhou, S., Chan, J., Bailey, J.: Can high-order dependencies improve mutual information based feature selection? Pattern Recogn. 53, 46–58 (2016)

  11. Bennasar, M., Hicks, Y., Setchi, R.: Feature selection using joint mutual information maximisation. Expert Syst. Appl. 42(22), 8520–8532 (2015)

  12. Zeng, Z., Zhang, H., Zhang, R., Yin, C.: A novel feature selection method considering feature interaction. Pattern Recogn. 48(8), 2656–2666 (2015)

  13. Shishkin, A., Bezzubtseva, A., Drutsa, A., Shishkov, I., Gladkikh, E., Gusev, G., Serdyukov, P.: Efficient high-order interaction-aware feature selection based on conditional mutual information. In: NIPS, pp. 4637–4645 (2016)

  14. Fleuret, F.: Fast binary feature selection with conditional mutual information. JMLR 5(8), 1531–1555 (2004)

  15. Jakulin, A.: Machine learning based on attribute interactions. Ph.D. thesis, pp. 1–252 (2005)

  16. Balagani, K.S., Phoha, V.V.: On the feature selection criterion based on an approximation of multidimensional mutual information. PAMI 32(7), 1342–1343 (2010)

  17. Brown, P.F., Desouza, P.V., Mercer, R.L., Pietra, V.J.D., Lai, J.C.: Class-based n-gram models of natural language. Comput. Linguist. 18(4), 467–479 (1992)

  18. Scott, D.W.: Multivariate Density Estimation. Wiley Series in Probability and Statistics, 2nd edn. Wiley, Hoboken (2015)


Acknowledgments

The authors would like to thank the anonymous reviewers for their careful reading of this paper and for their constructive comments and suggestions. This work is supported by the National Natural Science Foundation of China under Grant no. 61602094.

Author information

Correspondence to Xiaochuan Tang.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Tang, X., Dai, Y., Xiang, Y., Luo, L. (2018). An Interaction-Enhanced Feature Selection Algorithm. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol. 10939. Springer, Cham. https://doi.org/10.1007/978-3-319-93040-4_10


  • DOI: https://doi.org/10.1007/978-3-319-93040-4_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93039-8

  • Online ISBN: 978-3-319-93040-4

  • eBook Packages: Computer Science (R0)
