
Dynamic feature selection method with minimum redundancy information for linear data

Published in Applied Intelligence

Abstract

Feature selection plays a fundamental role in many data mining and machine learning tasks. In this paper, we propose a novel feature selection method, the Dynamic Feature Selection Method with Minimum Redundancy Information (MRIDFS). In MRIDFS, conditional mutual information is used to calculate the relevance and the redundancy among multiple features, and a new concept, the feature-dependent redundancy ratio, is introduced; this ratio represents redundancy more accurately. To evaluate our method, we tested MRIDFS against seven popular methods on 16 benchmark data sets. Experimental results show that MRIDFS outperforms the other methods in terms of average classification accuracy.
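The abstract describes MRIDFS only at a high level: conditional mutual information measures relevance and redundancy, and a feature-dependent redundancy ratio captures how much of a candidate feature's relevance is already covered by the features selected so far. The exact MRIDFS criterion is not given on this page, so the sketch below illustrates the general family of greedy, conditional-mutual-information-based selectors it belongs to; the scoring formula, the ratio-style penalty, and all function names are assumptions for illustration, not the authors' published algorithm.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for discrete 1-D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    mi = 0.0
    for (xv, yv), nxy in Counter(zip(x, y)).items():
        pxy = nxy / n
        px = np.mean(x == xv)
        py = np.mean(y == yv)
        mi += pxy * np.log(pxy / (px * py))
    return mi

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = sum over values z of p(z) * I(X;Y | Z=z)."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    cmi = 0.0
    for zv, nz in Counter(z.tolist()).items():
        mask = (z == zv)
        cmi += (nz / len(z)) * mutual_information(x[mask], y[mask])
    return cmi

def greedy_select(X, y, k):
    """Greedy forward selection with an illustrative redundancy-ratio penalty.

    Continuous features should be discretized first (e.g. equal-width
    bins), as is common for mutual-information-based selectors.
    """
    X, y = np.asarray(X), np.asarray(y)
    relevance = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            # I(f_j; y) - I(f_j; y | f_s) is the part of f_j's relevance
            # already carried by selected feature f_s; dividing by
            # I(f_j; y) turns it into a per-feature redundancy ratio.
            # (It can be negative when f_j and f_s are synergistic.)
            ratio = np.mean([
                (relevance[j] - conditional_mutual_information(X[:, j], y, X[:, s]))
                / max(relevance[j], 1e-12)
                for s in selected
            ])
            return relevance[j] * (1.0 - ratio)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

On a small discretized data set, `greedy_select(X, y, k=10)` returns column indices in selection order: the first pick is always the single most relevant feature, and each later pick trades raw relevance against the average redundancy ratio with respect to everything already chosen.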



Acknowledgements

The corresponding author gratefully acknowledges support from the National Key Research and Development Plan under Grant 2017YFB1402103, the National Natural Science Foundation of China under Grants 61402363 and 61771387, the Education Department of Shaanxi Province Key Laboratory Project under Grant 15JS079, the Xi'an Science Program Project under Grant 2017080CG/RC043(XALG017), the Ministry of Education of Shaanxi Province Research Project under Grant 17JK0534, and the Beilin District of Xi'an Science and Technology Project under Grant GX1625.

Author information

Correspondence to HongFang Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhou, H., Wen, J. Dynamic feature selection method with minimum redundancy information for linear data. Appl Intell 50, 3660–3677 (2020). https://doi.org/10.1007/s10489-020-01726-z
