
Dynamic feature selection method with minimum redundancy information for linear data

Published in Applied Intelligence

Abstract

Feature selection plays a fundamental role in many data mining and machine learning tasks. In this paper, we propose a novel feature selection method, the Dynamic Feature Selection Method with Minimum Redundancy Information (MRIDFS). In MRIDFS, conditional mutual information is used to calculate the relevance and the redundancy among multiple features, and a new concept, the feature-dependent redundancy ratio, is introduced; this ratio represents redundancy more accurately. To evaluate our method, we tested MRIDFS against seven popular methods on 16 benchmark data sets. Experimental results show that MRIDFS outperforms the other methods in terms of average classification accuracy.
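The abstract describes MRIDFS only at a high level: conditional mutual information measures relevance and redundancy, and a feature-dependent redundancy ratio captures how much of a candidate feature's relevance is already covered by the features selected so far. The exact MRIDFS criterion is not given on this page, so the sketch below illustrates the general family of greedy, conditional-mutual-information-based selectors it belongs to; the scoring formula, the ratio-style penalty, and all function names are assumptions for illustration, not the authors' published algorithm.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for discrete 1-D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    mi = 0.0
    for (xv, yv), nxy in Counter(zip(x, y)).items():
        pxy = nxy / n
        px = np.mean(x == xv)
        py = np.mean(y == yv)
        mi += pxy * np.log(pxy / (px * py))
    return mi

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = sum over values z of p(z) * I(X;Y | Z=z)."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    cmi = 0.0
    for zv, nz in Counter(z.tolist()).items():
        mask = (z == zv)
        cmi += (nz / len(z)) * mutual_information(x[mask], y[mask])
    return cmi

def greedy_select(X, y, k):
    """Greedy forward selection with an illustrative redundancy-ratio penalty.

    Continuous features should be discretized first (e.g. equal-width
    bins), as is common for mutual-information-based selectors.
    """
    X, y = np.asarray(X), np.asarray(y)
    relevance = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            # I(f_j; y) - I(f_j; y | f_s) is the part of f_j's relevance
            # already carried by selected feature f_s; dividing by
            # I(f_j; y) turns it into a per-feature redundancy ratio.
            # (It can be negative when f_j and f_s are synergistic.)
            ratio = np.mean([
                (relevance[j] - conditional_mutual_information(X[:, j], y, X[:, s]))
                / max(relevance[j], 1e-12)
                for s in selected
            ])
            return relevance[j] * (1.0 - ratio)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On a small discretized data set, `greedy_select(X, y, k=10)` returns column indices in selection order: the first pick is always the single most relevant feature, and each later pick trades raw relevance against the average redundancy ratio with respect to everything already chosen.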



Acknowledgements

The corresponding author gratefully acknowledges support from the National Key Research and Development Plan under Grant 2017YFB1402103, the National Natural Science Foundation of China under Grants 61402363 and 61771387, the Education Department of Shaanxi Province Key Laboratory Project under Grant 15JS079, the Xi'an Science Program Project under Grant 2017080CG/RC043(XALG017), the Ministry of Education of Shaanxi Province Research Project under Grant 17JK0534, and the Beilin District of Xi'an Science and Technology Project under Grant GX1625.

Author information

Correspondence to HongFang Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhou, H., Wen, J. Dynamic feature selection method with minimum redundancy information for linear data. Appl Intell 50, 3660–3677 (2020). https://doi.org/10.1007/s10489-020-01726-z
