Abstract
Multi-label feature selection mitigates the curse of dimensionality in multi-label data by selecting a small subset of the original features for classification. Existing multi-label feature selection algorithms often neglect the inherent uncertainty in multi-label data and do not adequately account for the relationships between features and labels when assessing feature importance. To address these shortcomings, a Fuzzy Information Gain Ratio-based multi-label feature selection algorithm considering Label Correlation (FIGR_LC) is proposed. FIGR_LC evaluates the importance of each feature by combining the relationship between the feature and individual labels with the correlation between the feature and label sets. Features are then ranked by the resulting weights. Experimental results demonstrate the effectiveness of FIGR_LC and show that it outperforms several established feature selection methods.
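The abstract does not state the formulas behind FIGR_LC, so the following is only a minimal sketch of the general idea and should not be read as the authors' algorithm. It assumes a fuzzy similarity relation of the form 1 - |x_i - x_j| on min-max scaled values, a fuzzy entropy based on fuzzy-class cardinalities, a joint relation taken as the elementwise minimum, and a hypothetical mixing parameter `alpha` that balances the per-label score against the label-set score. All function names are illustrative.

```python
import numpy as np

def fuzzy_relation(column):
    """Fuzzy similarity R[i, j] = 1 - |x_i - x_j| on min-max scaled values (assumed form)."""
    x = np.asarray(column, dtype=float)
    span = x.max() - x.min()
    x = (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return 1.0 - np.abs(x[:, None] - x[None, :])

def fuzzy_entropy(rel):
    """H(R) = -(1/n) * sum_i log2(|[x_i]_R| / n), where |[x_i]_R| is the i-th row sum."""
    n = rel.shape[0]
    return float(-np.mean(np.log2(rel.sum(axis=1) / n)))

def fuzzy_gain_ratio(rel_f, rel_t):
    """Fuzzy information gain H(F) + H(T) - H(F, T), normalised by H(F)."""
    h_f = fuzzy_entropy(rel_f)
    if h_f == 0.0:
        return 0.0  # a constant feature carries no information
    h_joint = fuzzy_entropy(np.minimum(rel_f, rel_t))
    return (h_f + fuzzy_entropy(rel_t) - h_joint) / h_f

def rank_features(X, Y, alpha=0.5):
    """Return (feature indices sorted by descending weight, weights).

    alpha is a hypothetical trade-off between the mean gain ratio over
    individual labels and the gain ratio against the whole label set.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    label_rels = [fuzzy_relation(Y[:, k]) for k in range(Y.shape[1])]
    label_set_rel = np.minimum.reduce(label_rels)  # relation induced by all labels jointly
    weights = []
    for j in range(X.shape[1]):
        rel_f = fuzzy_relation(X[:, j])
        per_label = np.mean([fuzzy_gain_ratio(rel_f, r) for r in label_rels])
        label_set = fuzzy_gain_ratio(rel_f, label_set_rel)
        weights.append(alpha * per_label + (1.0 - alpha) * label_set)
    weights = np.array(weights)
    return np.argsort(-weights), weights
```

Calling `rank_features(X, Y)` with an `n × m` feature matrix and an `n × q` binary label matrix yields a ranking from which the top-ranked features can be kept. The pairwise relations cost O(n²) memory per feature, so this sketch is only practical for small samples.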





Data availability
The data that support the findings of this study are openly available in Mulan at https://mulan.sourceforge.net/.
Acknowledgements
The authors would like to thank the Editors for their kind help and the anonymous referees for their valuable comments and helpful suggestions. The work is partially supported by the National Natural Science Foundation of China (Serial No. 62163016, 62066014), the Natural Science Foundation of Jiangxi Province (Serial No. 20212ACB202001, 20232BAB202004), the open project of the State Key Laboratory of Performance Monitoring and Protecting of Rail Transit Infrastructure, East China Jiaotong University (Grant No. HJGZ2023203), and the Jiangxi Double Thousand Plan.
About this article
Cite this article
Yu, Y., Lv, M., Qian, J. et al. Fuzzy information gain ratio-based multi-label feature selection with label correlation. Int. J. Mach. Learn. & Cyber. 15, 2737–2747 (2024). https://doi.org/10.1007/s13042-023-02060-9
DOI: https://doi.org/10.1007/s13042-023-02060-9