
A new feature selection using dynamic interaction

  • Theoretical Advances
  • Pattern Analysis and Applications

Abstract

With the continuous development of Internet technology, data have become increasingly complex and high-dimensional. Such high-dimensional data contain many redundant and irrelevant features, which pose great challenges to existing machine learning algorithms. Feature selection is an important research topic in machine learning, pattern recognition and data mining, and an important step in the data preprocessing stage. It seeks an optimal feature subset of the original feature set, which can improve classification accuracy and reduce machine learning time. Traditional feature selection algorithms tend to ignore features that have weak discriminative power individually but strong discriminative power as a group. Therefore, this paper proposes a new dynamic interaction feature selection (DIFS) algorithm. First, under the theoretical framework of interaction information, it redefines feature relevance, irrelevance and redundancy. Second, it gives formulas for computing interaction information. Finally, on eleven UCI data sets and with three different classifiers, namely KNN, SVM and C4.5, the DIFS algorithm increases classification accuracy over the full feature set by 3.2848% and reduces the number of selected features by 15.137 on average. Hence, the DIFS algorithm can effectively identify relevant features as well as irrelevant and redundant ones, improving the classification accuracy of the data sets while reducing their feature dimensionality.
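The abstract's key point, that two features can be individually uninformative about the class yet jointly decisive, is what interaction information measures. The sketch below is illustrative only; the helper names are my own and this is not the paper's DIFS implementation. It estimates mutual information and interaction gain, I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C), from discrete samples, using the classic XOR case where each feature alone carries zero information about the class:

```python
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy (bits) of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def interaction_gain(xs, ys, cs):
    """I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C).
    Positive values mean the pair (X, Y) tells us more about the
    class C than the sum of the features' individual contributions."""
    joint = mutual_info(list(zip(xs, ys)), cs)
    return joint - mutual_info(xs, cs) - mutual_info(ys, cs)

# XOR toy data: each feature alone says nothing about the class,
# but together they determine it completely.
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]
cs = [x ^ y for x, y in zip(xs, ys)]

print(mutual_info(xs, cs))             # 0.0 bit (irrelevant alone)
print(mutual_info(ys, cs))             # 0.0 bit (irrelevant alone)
print(interaction_gain(xs, ys, cs))    # 1.0 bit (fully informative as a pair)
```

A filter that ranks features only by individual mutual information would discard both X and Y here; a method that also scores interaction gain, as DIFS does, keeps them as a group.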



Acknowledgements

The experimental data sets are taken from the well-known UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html).

Funding

This work is supported by the National Science and Technology Basic Work Special Project of China under Grant 2015FY111700-6.

Author information

Authors and Affiliations

Authors

Contributions

The author wrote the manuscript, and read and approved the final version.

Corresponding author

Correspondence to Zhang Li.

Ethics declarations

Conflict of interest

The author declares that he has no competing interests.

Ethics approval and consent to participate

This study does not involve any ethical issues.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Li, Z. A new feature selection using dynamic interaction. Pattern Anal Applic 24, 203–215 (2021). https://doi.org/10.1007/s10044-020-00916-2


  • DOI: https://doi.org/10.1007/s10044-020-00916-2
