Leveraging Local Density Decision Labeling and Fuzzy Dependency for Semi-supervised Feature Selection

International Journal of Fuzzy Systems

Abstract

In real-world scenarios, datasets are often only partially labeled because acquiring decision labels is costly. Completing such datasets by filling in the missing labels is essential for preserving the valuable feature information of individual samples. Furthermore, in the era of big data, datasets tend to be high-dimensional, which complicates subsequent data processing. This study introduces a new semi-supervised feature selection technique. First, a local density decision-labeling algorithm fills in the missing decision labels of the semi-supervised dataset, producing a fully supervised dataset. Next, a fuzzy dependency-based feature selection approach identifies and retains the most relevant features of the completed dataset. Finally, a series of experiments validates the effectiveness and reliability of the proposed method.
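As a rough illustration of the two-stage pipeline described in the abstract, the following Python sketch first completes the missing labels and then selects features by a fuzzy dependency score. It is not the authors' algorithm: the density estimate (inverse mean distance to the k nearest neighbours), the majority-vote labeling rule, the min-based fuzzy similarity relation, the greedy forward search, and all function names and toy data are assumptions made for brevity. Features are assumed to be scaled to [0, 1], missing labels are encoded as -1, and at least one sample must be labeled.

```python
import numpy as np

def fill_labels_by_density(X, y, k=3):
    """Label unlabeled samples (y == -1), densest first (assumed rule).

    Each unlabeled sample takes the majority label among its k nearest
    samples that already carry a label. Requires >= 1 labeled sample.
    """
    y = y.copy()
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Local density: inverse of the mean distance to the k nearest
    # neighbours (column 0 of the sorted row is the sample itself).
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    unlabeled = np.where(y == -1)[0]
    for i in unlabeled[np.argsort(-density[unlabeled])]:
        labeled = np.where(y != -1)[0]
        nearest = labeled[np.argsort(dist[i, labeled])[:k]]
        values, counts = np.unique(y[nearest], return_counts=True)
        y[i] = values[np.argmax(counts)]
    return y

def fuzzy_dependency(X, y, feats):
    """Fuzzy-rough dependency of the labels on the feature subset `feats`.

    Similarity is min over features of 1 - |a(x) - a(z)|; the score is the
    mean membership of each sample in the lower approximation of its class.
    """
    if not feats:
        return 0.0
    diff = np.abs(X[:, None, feats] - X[None, :, feats])
    sim = (1.0 - diff).min(axis=2)
    other_class = y[:, None] != y[None, :]
    lower = np.where(other_class, 1.0 - sim, 1.0).min(axis=1)
    return float(lower.mean())

def select_features(X, y, tol=1e-6):
    """Greedy forward search: add the feature that raises dependency most."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [fuzzy_dependency(X, y, selected + [f]) for f in remaining]
        j = int(np.argmax(scores))
        if scores[j] - best <= tol:
            break  # no remaining feature improves the dependency
        best = scores[j]
        selected.append(remaining.pop(j))
    return selected, best

# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
y_true = np.repeat([0, 1], 10)
X = np.column_stack([
    np.concatenate([np.linspace(0.0, 0.2, 10), np.linspace(0.8, 1.0, 10)]),
    np.tile([0.4, 0.6], 10),
])
y_semi = np.full(20, -1)
known = [0, 1, 2, 10, 11, 12]
y_semi[known] = y_true[known]

y_filled = fill_labels_by_density(X, y_semi, k=3)
selected, dependency = select_features(X, y_filled)
print(selected, round(dependency, 3))
```

On this toy data the search keeps only the informative feature, because adding the uninformative one does not raise the dependency score. The all-pairs distance and similarity arrays cost O(n²) memory, so a sketch like this is only suitable for small datasets.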

Acknowledgements

We would like to thank the Editor-in-Chief, editors, and anonymous reviewers for their insightful and constructive comments, which have greatly aided us in improving the quality of the paper. This work was supported by the National Natural Science Foundation of China (Grant nos. 12261010, 12326353), the Natural Science Foundation of Guangxi (2023GXNSFBA026019), the Key Laboratory of Software Engineering in Guangxi MinZu University (2022-18XJSY-03), the Postdoctoral Fellowship Program of CPSF (no. GZB20230092), the China Postdoctoral Science Foundation (no. 2023M740383), and the Natural Science Foundation of Sichuan Province (no. 24NSFSC1654).

Author information

Contributions

GZ: Conceptualization, Methodology, Software, Data curation, Writing-original draft. JH: Software, Visualization. PZ: Methodology, Writing-review and editing.

Corresponding author

Correspondence to Pengfei Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, G., Hu, J. & Zhang, P. Leveraging Local Density Decision Labeling and Fuzzy Dependency for Semi-supervised Feature Selection. Int. J. Fuzzy Syst. 26, 2805–2820 (2024). https://doi.org/10.1007/s40815-024-01740-0

  • DOI: https://doi.org/10.1007/s40815-024-01740-0
