ASFS: A novel streaming feature selection for multi-label data based on neighborhood rough set

Applied Intelligence

Abstract

Online streaming feature selection methods based on neighborhood rough sets have attracted wide attention in recent years and play a vital role in processing high-dimensional data. However, most existing methods either handle single-label data only, or handle multi-label data by converting it into a combination of multiple single-label datasets, which ignores the fact that the label set of multi-label data is an integral whole. In this paper, we propose a novel online streaming feature selection method for multi-label learning via the neighborhood rough set model, in which feature significance, feature redundancy, and label space integrity are taken into account simultaneously. Specifically, we first define a new adaptive neighborhood relation that avoids setting a neighborhood parameter, and restructure the neighborhood rough set model so that it can process multi-label data directly. Based on this model, we introduce an evaluation criterion to select features that are important relative to the label set and the currently selected features, and present an optimization objective function to update the selected feature subset and filter out redundant features. Comparative experiments on different types of data sets verify the advantages of the proposed method.
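The abstract gives no formulas, so the following is only a minimal illustrative sketch of the general idea (a parameter-free neighborhood, a dependency score that treats each sample's label set as a whole, and a streaming loop that adds significant features and drops redundant ones). It should not be read as the ASFS definitions. Everything in it is an assumption made for illustration: the largest-gap neighborhood rule, the Chebyshev distance, the positive-region style dependency, the greedy redundancy filter, the toy data, and all names (`neighborhood`, `dependency`, `stream_select`, `fp`, `f0`, `noise`).

```python
from typing import Dict, List, Sequence, Tuple

Column = Sequence[float]      # one feature: a value per sample
LabelSet = Tuple[int, ...]    # one sample's whole label vector


def neighborhood(i: int, cols: List[Column]) -> List[int]:
    """Neighbors of sample i under an assumed largest-gap rule (no radius parameter)."""
    n = len(cols[0])
    # Chebyshev distance from sample i to every sample, itself included.
    dist = [max(abs(c[i] - c[j]) for c in cols) for j in range(n)]
    order = sorted(dist)
    cut, best_gap = order[0], -1.0
    for near, far in zip(order, order[1:]):
        if far - near > best_gap:          # the first largest gap wins
            best_gap, cut = far - near, near
    return [j for j in range(n) if dist[j] <= cut]


def dependency(cols: List[Column], labels: Sequence[LabelSet]) -> float:
    """Fraction of samples whose neighbors all carry an identical label set."""
    if not cols:
        return 0.0                         # convention: no features, no dependency
    n = len(labels)
    consistent = sum(
        all(labels[j] == labels[i] for j in neighborhood(i, cols))
        for i in range(n)
    )
    return consistent / n


def stream_select(stream: Sequence[Tuple[str, Column]],
                  labels: Sequence[LabelSet]) -> List[str]:
    """Process features one at a time: keep significant ones, drop redundant ones."""
    selected: Dict[str, Column] = {}
    for name, col in stream:
        base = dependency(list(selected.values()), labels)
        trial = dependency(list(selected.values()) + [col], labels)
        if trial <= base:                  # no gain in dependency: discard
            continue
        selected[name] = col
        for old in list(selected):         # greedy redundancy filter
            rest = [c for k, c in selected.items() if k != old]
            if dependency(rest, labels) >= dependency(list(selected.values()), labels):
                del selected[old]
    return list(selected)


# Toy data: six samples, two distinct label sets compared as whole vectors.
LABELS: List[LabelSet] = [(1, 0), (1, 0), (1, 0), (1, 1), (1, 1), (1, 1)]
STREAM: List[Tuple[str, Column]] = [
    ("fp", [0, 1, 5, 6, 11, 12]),      # partially informative, arrives first
    ("f0", [0, 1, 2, 10, 11, 12]),     # separates the two label sets
    ("noise", [5, 0, 10, 1, 9, 4]),    # uninformative
]

print(stream_select(STREAM, LABELS))   # ['f0']
```

On this toy stream, `fp` is accepted first because it raises the dependency from 0 to 1/3, `f0` is then accepted and makes `fp` redundant, and `noise` is rejected because it cannot raise a dependency that is already 1. Comparing whole label vectors is one simple way to avoid splitting the problem into independent single-label tasks; the paper's own criterion may differ.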

Notes

  1. http://mulan.sourceforge.net/datasets-mlc.html

  2. http://www.lamda.nju.edu.cn/code_MDDM.ashx

  3. http://palm.seu.edu.cn/zhangml/

  4. http://sites.labic.icmc.usp.br/pub/mcmonard

  5. http://ml.cau.ac.kr/?f=softwares&m=pmu

  6. https://github.com/jiazhang-ml/GRRO

  7. https://github.com/jiazhang-ml/MDFS

  8. http://www.lamda.nju.edu.cn/code_MLkNN.ashx

Acknowledgements

This work is supported by grants from the National Natural Science Foundation of China (Nos. 61673186, 61871196, 62001175, and 62076116), the National Key Research and Development Program of China (No. 2019YFC1604700), the Natural Science Foundation of Fujian Province (Nos. 2019J01081, 2019J01082, 2021J011187, and 2021J02049), the Key Laboratory of Data Science and Intelligence Application, Minnan Normal University (No. D202001), and the Scientific Research Funds of Huaqiao University (No. 605-50Y21005).

Author information

Corresponding authors

Correspondence to Jinghua Liu or Jia Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Lin, Y., Du, J. et al. ASFS: A novel streaming feature selection for multi-label data based on neighborhood rough set. Appl Intell 53, 1707–1724 (2023). https://doi.org/10.1007/s10489-022-03366-x

  • DOI: https://doi.org/10.1007/s10489-022-03366-x
