Abstract
Multi-label feature selection (MFS) has grown in importance as the need to process multi-semantic, high-dimensional data has increased. Most recent MFS methods assume either that all labels associated with an instance are equally important or that the labels are independent of one another. In many real-world applications, however, neither assumption holds: the relevant labels of an instance generally differ in significance, and label correlations are ubiquitous. Based on this observation, we propose a new algorithm, called FSEP, that performs MFS by considering both label significance and pairwise label correlations. FSEP first constructs a label enhancement method that recovers a label distribution and thereby captures label significance. It then uses neighborhood mutual information to explore how label correlations influence features and incorporates this influence into feature evaluation. Finally, a novel multi-label feature selection strategy, Max-Relevance, Max-Contribution, and Min-Redundancy, is proposed to achieve a favorable trade-off among feature relevance, the contribution of label correlations to features, and feature redundancy. Extensive experiments on both public and real-world datasets show that the proposed method achieves encouraging results compared with state-of-the-art MFS algorithms.
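The trade-off named in the abstract can be illustrated with a small greedy selector. This is only a sketch under stated assumptions, not the FSEP criterion itself: the abstract does not give the scoring formula, so the additive rule below (relevance plus contribution minus mean redundancy with already selected features) is hypothetical, as are the function name `greedy_select` and the toy scores. In practice the three inputs would be estimated from data, for example with neighborhood mutual information.

```python
from typing import List, Sequence


def greedy_select(
    relevance: Sequence[float],
    contribution: Sequence[float],
    redundancy: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Greedily pick up to k feature indices.

    Hypothetical scoring rule (an assumption, not the paper's formula):
        score(f) = relevance[f] + contribution[f]
                   - mean(redundancy[f][s] for s in selected)
    """
    selected: List[int] = []
    remaining = set(range(len(relevance)))

    def score(f: int) -> float:
        # Mean redundancy with the features chosen so far (0 if none yet).
        if selected:
            penalty = sum(redundancy[f][s] for s in selected) / len(selected)
        else:
            penalty = 0.0
        return relevance[f] + contribution[f] - penalty

    while remaining and len(selected) < k:
        # Iterate in index order so ties resolve to the lower index.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy, made-up scores for three features.
relevance = [0.9, 0.85, 0.3]
contribution = [0.1, 0.1, 0.2]
redundancy = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
print(greedy_select(relevance, contribution, redundancy, k=2))  # [0, 2]
```

In this toy run, feature 1 is nearly as relevant as feature 0 but highly redundant with it, so the less relevant yet complementary feature 2 is selected second.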
Acknowledgements
This work is supported by grants from the National Natural Science Foundation of China (Nos. 61871196, 61976120, 62001175, and 62076116), the Guiding Project of Fujian Science and Technology Plan (No. 2021H0019), the Natural Science Foundation of Fujian Province (Nos. 2021J011187, 2021J02049, and 2022J01317), the Project of Key Laboratory of Big Data and Artificial Intelligence in Universities of Fujian Province (No. Fujian Education Science [2019]67), and the 2021 Fujian Young and Middle-aged Teacher Education and Scientific Research Project (No. JAT210614).
About this article
Cite this article
Liu, J., Yang, S., Lin, Y. et al. Multi-label feature selection via joint label enhancement and pairwise label correlations. Int. J. Mach. Learn. & Cyber. 14, 3943–3964 (2023). https://doi.org/10.1007/s13042-023-01874-x
DOI: https://doi.org/10.1007/s13042-023-01874-x