Abstract
Feature selection has received considerable attention over the past decade, yet it is continually challenged by newly emerging settings. Semi-supervised multi-label learning is one such setting: in this work, it refers to learning from data that combine a large number of unlabeled instances with a small number of multi-labeled ones. Like conventional feature selection algorithms, semi-supervised multi-label feature selection suffers from poor stability, i.e., limited robustness to changes in the data. To address this weakness and improve the robustness of feature selection in high-dimensional data, this paper develops an ensemble methodology based on a three-way resampling of the data: (1) bagging over instances, (2) the random subspace method (RSM) over features, and (3) an additional random sub-labeling strategy (RSL) over labels. The proposed framework enhances the stability of feature selection algorithms and improves their performance. Our findings show that bagging and RSM improve the stability of the feature selection process and increase learning accuracy, while RSL captures label correlation, a major concern with multi-label data. The paper reports the key findings of a series of experiments conducted on selected benchmark data sets for classification. The results are promising: the proposed method either outperforms state-of-the-art algorithms or produces at least comparable results.
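To make the three-way resampling concrete, the sketch below shows one way such an ensemble could be organized in Python/NumPy. It is a minimal illustration, not the authors' 3-3FS implementation: the names `triple_resample`, `ensemble_scores`, and `base_scorer` (any per-view feature relevance scorer, e.g., a semi-supervised Laplacian-score variant), as well as the sampling fractions, are assumptions made for this example.

```python
import numpy as np

def triple_resample(X, Y, n_views=10, feat_frac=0.5, label_frac=0.5, seed=None):
    """Yield ensemble views of a multi-label data set via three-way resampling:
    bagging over instances, a random subspace over features (RSM),
    and random sub-labeling over the label set (RSL).

    X : (n_samples, n_features) feature matrix
    Y : (n_samples, n_labels) label matrix; rows of unlabeled instances
        may hold placeholders (e.g., NaN), handled by the base scorer.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    q = Y.shape[1]
    for _ in range(n_views):
        rows = rng.choice(n, size=n, replace=True)                       # bagging
        feats = rng.choice(d, size=max(1, int(feat_frac * d)), replace=False)   # RSM
        labels = rng.choice(q, size=max(1, int(label_frac * q)), replace=False) # RSL
        yield X[np.ix_(rows, feats)], Y[np.ix_(rows, labels)], feats

def ensemble_scores(X, Y, base_scorer, **resample_kwargs):
    """Aggregate per-view feature scores into one global relevance score.

    base_scorer(X_view, Y_view) -> array of scores, one per feature in the view.
    """
    d = X.shape[1]
    total = np.zeros(d)
    counts = np.zeros(d)
    for X_view, Y_view, feats in triple_resample(X, Y, **resample_kwargs):
        total[feats] += base_scorer(X_view, Y_view)  # score features in this view
        counts[feats] += 1
    return total / np.maximum(counts, 1)  # mean score; unsampled features get 0

# Usage sketch: rank features by aggregated score and keep the top k.
# ranking = np.argsort(-ensemble_scores(X, Y, base_scorer, n_views=20, seed=0))
```

Averaging scores over many randomized views is what yields the stability gain: a feature must score well across multiple perturbed versions of the data to rank highly, so the final selection is less sensitive to any single sample, feature subset, or label subset.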
Acknowledgements
We thank the anonymous reviewers for their helpful comments and suggestions. The authors would also like to thank the DGRSDT (General Directorate of Scientific Research and Technological Development) of the MESRS (Ministry of Higher Education and Scientific Research), Algeria, for supporting the LISCO Laboratory.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Alalga, A., Benabdeslem, K. & Mansouri, D.E.K. 3-3FS: ensemble method for semi-supervised multi-label feature selection. Knowl Inf Syst 63, 2969–2999 (2021). https://doi.org/10.1007/s10115-021-01616-x