Feature selection considering weighted relevancy

Applied Intelligence

Abstract

Feature selection plays an important role in pattern recognition and machine learning. Feature selection based on information theory aims to preserve the relevancy between features and class labels while eliminating irrelevant and redundant features. Previous feature selection methods have offered various explanations for feature relevancy, but they ignored the relationship between candidate feature relevancy and selected feature relevancy. To fill this gap, we propose a feature selection method named Feature Selection based on Weighted Relevancy (WRFS). In WRFS, we introduce two weight coefficients that use mutual information and joint mutual information to balance the importance of the two kinds of feature relevancy terms. To evaluate its classification performance, WRFS is compared with three competing feature selection methods and three state-of-the-art methods using two different classifiers on 18 benchmark data sets. The experimental results indicate that WRFS outperforms these baselines in terms of classification accuracy, AUC, and F1 score.
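The abstract names the two quantities behind the weight coefficients, mutual information and joint mutual information, but this preview does not give the WRFS criterion itself. The following Python sketch only illustrates how such quantities can be estimated for discrete features and combined in a greedy forward search. The function names, the `eps` guard, and in particular the weighting inside `weighted_score` are assumptions made for illustration; they are not the authors' formula.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def joint_mutual_info(x1, x2, y):
    """I(X1,X2;Y) = H(X1,X2) + H(Y) - H(X1,X2,Y)."""
    return entropy(x1, x2) + entropy(y) - entropy(x1, x2, y)


def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = I(X,Z;Y) - I(Z;Y), by the chain rule."""
    return joint_mutual_info(x, z, y) - mutual_info(z, y)


def weighted_score(k, selected, features, y, eps=1e-12):
    """Hypothetical weighted-relevancy score for candidate k.

    Assumption: each weight is a feature's own mutual information with the
    class divided by the pair's joint mutual information. This is an
    illustrative choice, NOT the criterion published for WRFS.
    """
    score = 0.0
    for j in selected:
        joint = joint_mutual_info(features[k], features[j], y) + eps
        w_cand = mutual_info(features[k], y) / joint  # weight on candidate term
        w_sel = mutual_info(features[j], y) / joint   # weight on selected term
        score += w_cand * cond_mutual_info(features[k], y, features[j])
        score += w_sel * cond_mutual_info(features[j], y, features[k])
    return score


def greedy_select(features, y, n_select):
    """Forward selection: seed with the max-I(X;Y) feature, then add greedily."""
    remaining = list(range(len(features)))
    first = max(remaining, key=lambda k: mutual_info(features[k], y))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda k: weighted_score(k, selected, features, y))
        selected.append(best)
        remaining.remove(best)
    return selected


# Usage on toy data: the class label is the XOR of two binary features.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
label = [0, 1, 1, 0]
print(mutual_info(a, label), joint_mutual_info(a, b, label))
```

In the XOR example neither feature is informative on its own, yet the pair determines the class, which is the kind of situation where joint mutual information adds something that per-feature relevancy misses. The estimators count value frequencies, so continuous features would need discretization first.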

Acknowledgements

This work was supported by the National Natural Science Foundation of China [grant numbers 61772226, 61373051, 61502343]; Science and Technology Development Program of Jilin Province [grant number 20140204004GX]; Science Research Funds for the Guangxi Universities [grant number KY2015ZD122]; Science Research Funds for the Wuzhou University [grant number 2014A002]; Project of Science and Technology Innovation Platform of Computing and Software Science (985 Engineering); Key Laboratory for Symbol Computation and Knowledge Engineering of the National Education Ministry of China; Fundamental Research Funds for the Central.

Author information

Corresponding author

Correspondence to Guixia Liu.

About this article

Cite this article

Zhang, P., Gao, W. & Liu, G. Feature selection considering weighted relevancy. Appl Intell 48, 4615–4625 (2018). https://doi.org/10.1007/s10489-018-1239-6

  • DOI: https://doi.org/10.1007/s10489-018-1239-6
