Semi-supervised feature selection with minimal redundancy based on local adaptive

Abstract

With the rapid development of network technology, diverse data grow by hundreds of millions of items per hour, putting ever-greater pressure on the acquisition of data labels. Semi-supervised feature selection has therefore moved to the forefront of dimensionality-reduction research, since it promises both “few labels” and “high efficiency”. In particular, graph-based methods exploit data with missing labels fully and effectively, making them a research hotspot in semi-supervised feature selection. However, existing graph-based methods do not simultaneously account for outliers, noise, and the redundancy of the selected features. To address these problems, a novel semi-supervised feature selection method based on local adaptivity and minimal redundancy is proposed. Weights are assigned to the local structure flexibly according to the data distribution, which reduces the impact of outliers and noise; moreover, a high-similarity penalty mechanism is introduced into the feature mapping matrix to promote a discriminative, low-redundancy feature subset. In addition, an iterative optimization method is designed, and its convergence is proved both theoretically and experimentally. Finally, the proposed algorithm is shown to be stable and effective through experiments on sixteen public datasets covering five evaluation aspects.
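
The abstract names three computational ingredients: a locally adaptive weighting of the neighborhood graph, a high-similarity penalty on the feature mapping matrix to suppress redundant features, and an iterative solver. The Python sketch below shows one plausible way such pieces could fit together; it is not the authors' formulation. The adaptive kNN bandwidth rule, the correlation-based redundancy penalty, the l2,1-style reweighting, and every function name and parameter value are illustrative assumptions.

```python
# Illustrative sketch only: a minimal reading of the pipeline described in the
# abstract, not the authors' algorithm. The bandwidth rule, the redundancy
# penalty, and all parameter values are assumptions made for this example.

import numpy as np

def adaptive_knn_graph(X, k=5):
    """kNN affinity whose bandwidth adapts to local density (distance to the
    k-th neighbor), so sparse regions and outliers receive weaker edges."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    idx = np.argsort(D, axis=1)[:, 1:k + 1]       # k nearest neighbors, skipping self
    sigma = D[np.arange(n), idx[:, -1]] + 1e-12   # local bandwidth per sample
    S = np.zeros((n, n))
    for i in range(n):
        S[i, idx[i]] = np.exp(-D[i, idx[i]] ** 2 / (sigma[i] * sigma[idx[i]]))
    return (S + S.T) / 2                          # symmetrize

def select_features(X, Y, k=5, alpha=1.0, beta=0.1, iters=30):
    """X: (n, d) data; Y: (n, c) one-hot labels with all-zero rows for
    unlabeled samples. Returns feature indices ranked by importance."""
    n, d = X.shape
    S = adaptive_knn_graph(X, k)
    L = np.diag(S.sum(1)) - S                     # graph Laplacian
    C = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    np.fill_diagonal(C, 0.0)                      # penalize only distinct, similar feature pairs
    W = np.random.default_rng(0).standard_normal((d, Y.shape[1])) * 0.01
    A = X.T @ X + alpha * X.T @ L @ X             # label-fit + manifold-smoothness terms
    for _ in range(iters):                        # iteratively reweighted updates
        G = np.diag(1.0 / (2.0 * np.linalg.norm(W, axis=1) + 1e-8))  # l2,1 surrogate
        W = np.linalg.solve(A + beta * C + G, X.T @ Y)
    return np.argsort(np.linalg.norm(W, axis=1))[::-1]

if __name__ == "__main__":
    # Toy run: 60 samples, 10 features, labels for only the first 12 rows.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((60, 10))
    Y = np.zeros((60, 2))
    Y[:6, 0] = 1.0
    Y[6:12, 1] = 1.0
    print(select_features(X, Y)[:3])  # three top-ranked feature indices
```

In this reading, the redundancy term tr(W^T C W) discourages highly correlated feature pairs from both receiving large weights, while the reweighted diagonal G is the standard surrogate for an l2,1-norm row-sparsity penalty; how these correspond to the paper's exact objective is an open assumption of the sketch.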

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61976182, 61602327, 61876157, 61976245, 62076171), Key program for International S&T Cooperation of Sichuan Province (2019YFH0097) and Sichuan Key R&D project (2020YFG0035).

Author information

Corresponding author

Correspondence to Hongmei Chen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, X., Chen, H., Li, T. et al. Semi-supervised feature selection with minimal redundancy based on local adaptive. Appl Intell 51, 8542–8563 (2021). https://doi.org/10.1007/s10489-021-02288-4
