Multi-label feature selection via joint label enhancement and pairwise label correlations

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Multi-label feature selection (MFS) has grown in importance as the need to process multi-semantic, high-dimensional data has increased. Recent studies usually address MFS by assuming either that all associated labels are equally important for each instance or that the labels are independent of each other. In many real-world applications, however, neither assumption holds: the significance of each relevant label generally differs, and label correlations are ubiquitous. Based on this observation, we propose a new algorithm, called FSEP, that performs MFS by considering both label significance and pairwise label correlations. In FSEP, we first construct a label enhancement method that recovers a label distribution and thereby captures the information of label significance. FSEP then explores how label correlations influence features by using neighborhood mutual information and incorporates this influence into the process of feature evaluation. Finally, a novel multi-label feature selection strategy, namely Max-Relevance, Max-Contribution, and Min-Redundancy, is proposed, which achieves a favorable trade-off among feature relevance, the contribution of label correlations to features, and feature redundancy. Extensive experiments on both public and real-world datasets show that the proposed method achieves encouraging results compared with state-of-the-art MFS algorithms.
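The trade-off named in the abstract can be illustrated with a generic greedy forward search. The sketch below is not the FSEP criterion, whose exact form is not given on this page. It assumes a simple additive score (relevance plus contribution minus mean redundancy with the features already selected), and the function name `greedy_select`, its parameters, and the toy score values are all hypothetical.

```python
from typing import Callable, List, Sequence


def greedy_select(
    n_features: int,
    k: int,
    relevance: Sequence[float],
    contribution: Sequence[float],
    redundancy: Callable[[int, int], float],
) -> List[int]:
    """Greedily pick up to k feature indices.

    Each step adds the candidate f that maximises
        relevance[f] + contribution[f] - mean(redundancy(f, s) for s in selected).
    This additive form is an assumption made for illustration only.
    """
    selected: List[int] = []
    candidates = set(range(n_features))
    while candidates and len(selected) < k:

        def score(f: int) -> float:
            # Mean redundancy with already-selected features (0 if none yet).
            if selected:
                red = sum(redundancy(f, s) for s in selected) / len(selected)
            else:
                red = 0.0
            return relevance[f] + contribution[f] - red

        # Iterate in index order so that ties go to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy scores (made up): features 0 and 1 are both relevant but highly
# redundant with each other; feature 2 is less relevant but benefits
# more from label correlations.
relevance = [0.9, 0.85, 0.3, 0.5]
contribution = [0.1, 0.1, 0.4, 0.0]
redundancy_matrix = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.1, 0.2],
    [0.1, 0.1, 0.0, 0.1],
    [0.2, 0.2, 0.1, 0.0],
]

selected = greedy_select(
    4, 3, relevance, contribution, lambda a, b: redundancy_matrix[a][b]
)
print(selected)  # [0, 2, 1]
```

In this toy run, feature 1 is the second most relevant on its own but is picked last because it is largely redundant with feature 0, while the contribution term lifts feature 2 ahead of it.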

Notes

  1. http://www.lamda.nju.edu.cn/code_MDDM.ashx.

  2. http://mulan.sourceforge.net/datasets.html.

  3. https://github.com/XSilverBullet/parkinson-

Acknowledgements

This work is supported by grants from the National Natural Science Foundation of China (Nos. 61871196, 61976120, 62001175, and 62076116), the Guiding Project of Fujian Science and Technology Plan (No. 2021H0019), the Natural Science Foundation of Fujian Province (Nos. 2021J011187, 2021J02049, and 2022J01317), the Project of Key Laboratory of Big Data and Artificial Intelligence in Universities of Fujian Province (No. Fujian Education Science [2019]67), and the 2021 Fujian Young and Middle-aged Teacher Education and Scientific Research Project (No. JAT210614).

Author information

Corresponding author

Correspondence to Jinghua Liu.

About this article

Cite this article

Liu, J., Yang, S., Lin, Y. et al. Multi-label feature selection via joint label enhancement and pairwise label correlations. Int. J. Mach. Learn. & Cyber. 14, 3943–3964 (2023). https://doi.org/10.1007/s13042-023-01874-x

  • DOI: https://doi.org/10.1007/s13042-023-01874-x
