
Feature Selection Using Mutual Information: An Experimental Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5351)

Abstract

In real-world applications, data are often represented by hundreds or thousands of features, most of which are redundant or irrelevant; their presence can directly degrade the performance of learning algorithms. Selecting the most salient features is therefore a pressing requirement in practice. A large number of feature selection methods employing various strategies have been proposed, and among them the methods based on mutual information have recently gained particular popularity. In this paper, a general criterion function for feature selectors using mutual information is first introduced. This function brings current mutual information based selectors together under a unifying scheme. An experimental comparative study of eight typical filter-type mutual information based feature selection algorithms on thirty-three datasets is then presented. We evaluate the algorithms from four essential aspects, and the experimental results show that none of the methods significantly outperforms the others. Even so, the conditional mutual information based feature selection algorithm dominates the other methods on the whole, provided that training time is not a concern.
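
The general criterion function itself is not reproduced on this page, but a well-known member of the family of selectors it covers is Battiti's MIFS, which greedily scores each candidate feature f by J(f) = I(f; C) − β · Σ_{s∈S} I(f; s), i.e., relevance to the class C minus a redundancy penalty against the already selected set S. The Python sketch below illustrates this greedy filter scheme under stated assumptions: discrete features, an illustrative β = 0.5, and helper names of our own choosing. It is a sketch, not the paper's implementation.

    # Minimal sketch of a greedy MIFS-style filter selector (assumptions:
    # discrete features, illustrative beta = 0.5; not the paper's code).
    import numpy as np

    def mutual_information(x, y):
        # I(x; y) in nats for two discrete 1-D arrays, estimated from counts.
        n = len(x)
        joint = {}
        for xi, yi in zip(x, y):
            joint[(xi, yi)] = joint.get((xi, yi), 0) + 1
        px = {v: np.mean(x == v) for v in set(x)}
        py = {v: np.mean(y == v) for v in set(y)}
        mi = 0.0
        for (xi, yi), count in joint.items():
            pxy = count / n
            mi += pxy * np.log(pxy / (px[xi] * py[yi]))
        return mi

    def mifs_select(X, y, k, beta=0.5):
        # Greedily pick k columns of X, trading class relevance I(f; C)
        # against redundancy with the already selected features S.
        relevance = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
        selected, remaining = [], list(range(X.shape[1]))
        while len(selected) < k and remaining:
            scores = [relevance[j]
                      - beta * sum(mutual_information(X[:, j], X[:, s])
                                   for s in selected)
                      for j in remaining]
            best = remaining[int(np.argmax(scores))]
            selected.append(best)
            remaining.remove(best)
        return selected

    # Example usage: selected = mifs_select(X_discrete, labels, k=10)

Other mutual information based selectors, such as CMIM or mRMR, keep the same greedy loop and change only the scoring term, which is what makes a single unifying criterion function of the kind introduced in the paper feasible.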




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, H., Liu, L., Zhang, H. (2008). Feature Selection Using Mutual Information: An Experimental Study. In: Ho, T.B., Zhou, Z.H. (eds) PRICAI 2008: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_24


  • DOI: https://doi.org/10.1007/978-3-540-89197-0_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89196-3

  • Online ISBN: 978-3-540-89197-0

  • eBook Packages: Computer Science, Computer Science (R0)
