
3-3FS: ensemble method for semi-supervised multi-label feature selection

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Feature selection has received considerable attention over the past decade, yet it is continually challenged by newly emerging issues. Semi-supervised multi-label learning is one such promising setting: in this work, we use the term for learning from data that combine a large amount of unlabeled instances with a small number of multi-labeled instances. Like conventional feature selection algorithms, semi-supervised multi-label feature selection has a rather poor record with respect to stability (i.e., robustness to changes in the data). To address this weakness and improve the robustness of feature selection in high-dimensional data, this paper develops an ensemble methodology based on a three-way resampling of the data: (1) bagging, (2) the random subspace method (RSM), and (3) an additional random sub-labeling strategy (RSL). The proposed framework enhances the stability of feature selection algorithms and improves their performance. Our findings show that bagging and RSM improve the stability of the feature selection process and increase learning accuracy, while RSL addresses label correlation, a major concern with multi-label data. The paper presents the key findings of a series of classification experiments conducted on selected benchmark data sets. The results are promising: the proposed method either outperforms state-of-the-art algorithms or produces at least comparable results.
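To make the three-way resampling concrete, the sketch below shows one plausible way to combine the three strategies into an ensemble feature scorer: each round draws a bootstrap sample of instances (bagging), a random subset of features (RSM), and a random subset of labels (RSL), scores the sampled features on that reduced view of the data, and the per-round scores are averaged into a consensus ranking. This is a minimal illustration, not the authors' 3-3FS algorithm: every name in it is ours, and placeholder_score stands in for the paper's semi-supervised multi-label relevance criterion.

```python
# Minimal sketch of a three-way resampling ensemble for feature
# selection: bagging over instances, random subspaces over features
# (RSM), and random sub-labeling over labels (RSL). All names are
# illustrative; placeholder_score is NOT the paper's criterion.
import numpy as np

rng = np.random.default_rng(0)

def placeholder_score(X_sub, Y_sub):
    """Stand-in relevance score: mean absolute feature-label
    correlation over the sampled labels."""
    d = X_sub.shape[1]
    scores = np.zeros(d)
    for j in range(d):
        for k in range(Y_sub.shape[1]):
            c = np.corrcoef(X_sub[:, j], Y_sub[:, k])[0, 1]
            if not np.isnan(c):
                scores[j] += abs(c)
    return scores / Y_sub.shape[1]

def ensemble_feature_scores(X, Y, n_rounds=50, feat_frac=0.5, label_frac=0.5):
    n, d = X.shape
    q = Y.shape[1]
    agg = np.zeros(d)      # accumulated score per feature
    seen = np.zeros(d)     # number of rounds each feature was sampled in
    for _ in range(n_rounds):
        rows = rng.choice(n, size=n, replace=True)                  # bagging
        feats = rng.choice(d, size=max(1, int(feat_frac * d)),
                           replace=False)                           # RSM
        labels = rng.choice(q, size=max(1, int(label_frac * q)),
                            replace=False)                          # RSL
        s = placeholder_score(X[np.ix_(rows, feats)], Y[np.ix_(rows, labels)])
        agg[feats] += s
        seen[feats] += 1
    seen[seen == 0] = 1    # features never sampled keep a zero score
    return agg / seen      # consensus: average score across rounds

# Toy usage: rank the features of random data carrying 3 labels.
X = rng.normal(size=(100, 20))
Y = (rng.normal(size=(100, 3)) > 0).astype(float)
ranking = np.argsort(-ensemble_feature_scores(X, Y))
print("Top 5 features:", ranking[:5])
```

The aggregation step is what the abstract's stability claim rests on: a feature must score well across many resampled views of the data to rank highly, so the final ranking is less sensitive to perturbations of any single sample.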

Notes

  1. https://bailando.berkeley.edu/enron_email.html.

Acknowledgements

We thank the anonymous reviewers for their very useful comments and suggestions. The authors would also like to thank the DGRSDT (General Directorate of Scientific Research and Technological Development) and the MESRS (Ministry of Higher Education and Scientific Research), Algeria, for their support of the LISCO Laboratory.

Author information

Corresponding author

Correspondence to Dou El Kefel Mansouri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Alalga, A., Benabdeslem, K. & Mansouri, D.E.K. 3-3FS: ensemble method for semi-supervised multi-label feature selection. Knowl Inf Syst 63, 2969–2999 (2021). https://doi.org/10.1007/s10115-021-01616-x
