Multi-label feature selection for missing labels by granular-ball based mutual information

Shu, Wenhao; Hu, Yichen; Qian, Wenbin

doi:10.1007/s10489-024-05809-z

Published: 23 September 2024

Volume 54, pages 12589–12612 (2024)

Applied Intelligence

Wenhao Shu¹, Yichen Hu¹ & Wenbin Qian²

Abstract

Multi-label feature selection serves as an effective dimensionality reduction technique for high-dimensional multi-label data. However, most feature selection methods assume that the labels are complete. In real-world applications, labels in a multi-label dataset may be missing because of the difficulty of collecting sufficient labels, which causes valuable information to be overlooked and leads to inaccurate classification predictions. To address these issues, this paper proposes a feature selection algorithm based on granular-ball based mutual information for multi-label data with missing labels. First, to improve classification ability, a label recovery model is proposed to recover missing labels; it exploits the correlation between labels together with the properties of label-specific features and global common features. Second, to avoid computing a neighborhood radius, a granular-ball based mutual information metric that fits the data distribution well is proposed for evaluating candidate features. Finally, the corresponding feature selection algorithm is developed to select a feature subset from multi-label data with missing labels. Experiments on different datasets demonstrate that the proposed algorithm considerably improves classification accuracy compared with state-of-the-art algorithms. The code is publicly available at https://github.com/skylark-leo/MLMLFS.git
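
To make the idea of scoring candidate features by mutual information under missing labels concrete, the following minimal sketch may help. It is not the algorithm proposed in the paper: it uses plain discrete mutual information instead of the granular-ball based metric, and it simply skips missing label entries rather than recovering them. The function names `mutual_information` and `rank_features`, the use of `None` to mark a missing label, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Discrete mutual information I(X;Y) in bits.

    Pairs whose label is None (assumed marker for a missing label)
    are skipped, so only observed labels contribute.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(X, Y):
    """Score each feature by its summed MI with every label column.

    Returns (feature indices sorted best-first, raw scores).
    """
    n_features = len(X[0])
    n_labels = len(Y[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        score = sum(
            mutual_information(column, [row[k] for row in Y])
            for k in range(n_labels)
        )
        scores.append(score)
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): 6 samples, 2 discrete features, 2 labels.
# None marks a missing label entry.
X = [[0, 1], [0, 0], [1, 1], [1, 0], [0, 1], [1, 0]]
Y = [[0, None], [0, 0], [1, 1], [1, None], [0, 0], [1, 1]]

order, scores = rank_features(X, Y)
print(order, scores)
```

In this toy setting the first feature coincides with both labels wherever they are observed, so it is ranked first. Skipping missing entries discards information, which is the weakness that a label recovery step, as described in the abstract, is meant to address.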

Data Availability

Data will be made available on request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 62266018 and 62366019) and the Natural Science Foundation of Jiangxi Province (No. 20232BAB202052).

Author information

Authors and Affiliations

School of Information Engineering, East China Jiaotong University, Nanchang, China
Wenhao Shu & Yichen Hu
School of Software, Jiangxi Agricultural University, Nanchang, China
Wenbin Qian

Contributions

Wenhao Shu: Conceptualization, Methodology, Visualization, Writing - original draft. Yichen Hu: Data curation, Software, Validation, Formal analysis, Writing - original draft. Wenbin Qian: Investigation, Supervision, Writing - review & editing.

Corresponding author

Correspondence to Wenbin Qian.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical standard

Not applicable to this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shu, W., Hu, Y. & Qian, W. Multi-label feature selection for missing labels by granular-ball based mutual information. Appl Intell 54, 12589–12612 (2024). https://doi.org/10.1007/s10489-024-05809-z

Accepted: 22 August 2024
Published: 23 September 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s10489-024-05809-z
