Abstract
Training data in multi-label learning are often high dimensional and contain considerable noise and redundant information, which leads to high memory overhead and poor classification performance. Dimensionality reduction for multi-label data has therefore become an important research topic. Existing reduction methods for multi-label data focus on either the instance level or the feature level; few address both. This paper proposes a novel two-stage method that reduces the dimensionality of both instances and features in multi-label data. In the instance reduction stage, the original training data are transformed into single-label data using binary relevance. Learning vector quantization is then applied to the transformed data for prototype selection, and new, instance-level low-dimensional multi-label data are generated from the nearest-neighbor information of the selected prototypes. In the feature reduction stage, a filter-based feature selection method chooses discriminative features for each class label, and the number of retained features is determined by a preset proportion parameter, achieving feature-level dimensionality reduction. Experimental results on seven benchmark datasets verify the effectiveness of the proposed method.
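The following is a minimal sketch of the two-stage pipeline described above, assuming a NumPy feature matrix `X` (samples by features) and a binary label matrix `Y` (samples by labels). The LVQ prototype-selection step is approximated here with per-class k-means prototypes within each binary-relevance sub-problem, and the filter criterion is a chi-squared score; both are illustrative stand-ins, not the authors' exact procedures, and `prototypes_per_class` and `keep_ratio` are hypothetical parameters.

```python
# Illustrative sketch only: k-means stands in for LVQ prototype selection,
# and chi2 stands in for the paper's filter criterion.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import chi2
from sklearn.neighbors import NearestNeighbors


def instance_reduction(X, Y, prototypes_per_class=10, random_state=0):
    """Stage 1: binary relevance + prototype selection.

    Each label is treated as an independent binary problem; prototypes are
    computed per class, then each prototype is relabeled with the full label
    set of its nearest original training instance.
    """
    protos = []
    for j in range(Y.shape[1]):
        for c in (0, 1):  # binary relevance: one two-class sub-problem per label
            Xc = X[Y[:, j] == c]
            if len(Xc) == 0:
                continue
            k = min(prototypes_per_class, len(Xc))
            km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(Xc)
            protos.append(km.cluster_centers_)
    P = np.unique(np.vstack(protos), axis=0)  # merge duplicate prototypes
    nn = NearestNeighbors(n_neighbors=1).fit(X)
    idx = nn.kneighbors(P, return_distance=False).ravel()
    return P, Y[idx]  # reduced instances and their multi-label assignments


def feature_reduction(X, Y, keep_ratio=0.3):
    """Stage 2: per-label filter scores; keep a preset proportion of features."""
    n_keep = max(1, int(keep_ratio * X.shape[1]))
    scores = np.zeros(X.shape[1])
    for j in range(Y.shape[1]):
        s, _ = chi2(X - X.min(axis=0), Y[:, j])  # chi2 requires non-negative inputs
        scores += np.nan_to_num(s)
    selected = np.argsort(scores)[-n_keep:]
    return X[:, selected], selected
```

In use, one would first call `P, Yp = instance_reduction(X, Y)` and then `Xr, cols = feature_reduction(P, Yp)` before training any multi-label classifier on the doubly reduced data.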
Data availability
Data available on request from the authors.
Notes
These benchmark datasets were sourced from: https://mulan.sourceforge.net/datasets-mlc.html.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant Nos. 62176197 and 61806155) and the Natural Science Foundation of Shaanxi Province (Grant No. 2020GY-062).
Ethics declarations
Conflict of interest
There are no potential competing interests in this paper. All authors have seen the manuscript and approved its submission. We confirm that the content of the manuscript has not been published or submitted for publication elsewhere.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, H., Fang, M. & Wang, P. Dual dimensionality reduction on instance-level and feature-level for multi-label data. Neural Comput & Applic 35, 24773–24782 (2023). https://doi.org/10.1007/s00521-022-08117-0