Multi-label causal feature selection based on neighbourhood mutual information

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Multi-label feature selection has gained significant attention over the past decades. However, most existing algorithms lack interpretability and do not uncover the underlying causal mechanisms. The Markov blanket (MB) is a key concept in Bayesian networks: it represents the local causal structure of a variable and can serve as the optimal feature set in multi-label feature selection. To select causal features for multi-label learning, in this paper the Parents and Children (PC) of each label are first discovered via the HITON method. Then, we distinguish parents from children and search for the Spouses (SP) of each label based on neighbourhood conditional mutual information. Moreover, the equivalent-information phenomenon in multi-label datasets can cause some relevant features to be ignored, so we design a conditional independence test metric to retrieve them. In addition, we search for features common to related labels and for label-specific features of individual labels. Finally, we propose a Multi-label Causal Feature Selection with Neighbourhood Mutual Information algorithm, called MCFS-NMI. To verify the performance of MCFS-NMI, we compare it with five well-established multi-label feature selection algorithms on six datasets. Experimental results show that the proposed algorithm achieves highly competitive performance against all of the compared algorithms.
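
To make the central quantity concrete, the following is a minimal sketch of neighbourhood mutual information and its conditional form. It assumes the common neighbourhood-entropy definition NH(S) = -(1/n) Σ log(|δ_S(x_i)| / n), where δ_S(x_i) is the set of samples lying within radius δ of sample x_i on the columns S. The Chebyshev distance, the single shared radius `delta`, the helper names and the toy data are all assumptions made for illustration. The abstract does not specify these details, so this is not necessarily the exact metric used in MCFS-NMI.

```python
import math


def neighbourhood_size(data, i, cols, delta):
    """Count samples within Chebyshev distance `delta` of sample i on `cols`.

    With an empty column list every sample is a neighbour.
    """
    return sum(
        1
        for row in data
        if all(abs(row[c] - data[i][c]) <= delta for c in cols)
    )


def nh_entropy(data, cols, delta):
    """Neighbourhood entropy NH(cols) = -(1/n) * sum_i log(|delta(x_i)| / n)."""
    n = len(data)
    return -sum(
        math.log(neighbourhood_size(data, i, cols, delta) / n) for i in range(n)
    ) / n


def nmi(data, r, s, delta):
    """Neighbourhood mutual information NH(R) + NH(S) - NH(R u S)."""
    return (
        nh_entropy(data, r, delta)
        + nh_entropy(data, s, delta)
        - nh_entropy(data, r + s, delta)
    )


def cond_nmi(data, r, s, t, delta):
    """Conditional form NH(R u T) + NH(S u T) - NH(R u S u T) - NH(T)."""
    return (
        nh_entropy(data, r + t, delta)
        + nh_entropy(data, s + t, delta)
        - nh_entropy(data, r + s + t, delta)
        - nh_entropy(data, t, delta)
    )


# Hypothetical toy data: columns are (feature 0, feature 1, binary label).
# Feature 0 tracks the label; feature 1 is unrelated to it.
data = [
    (0.0, 0.0, 0),
    (0.1, 1.0, 0),
    (0.0, 0.5, 0),
    (0.9, 0.0, 1),
    (1.0, 1.0, 1),
    (0.9, 0.5, 1),
]
delta = 0.15  # radius below 1, so label neighbourhoods are exact matches

relevance_f0 = nmi(data, [0], [2], delta)
relevance_f1 = nmi(data, [1], [2], delta)
f0_given_f1 = cond_nmi(data, [0], [2], [1], delta)
```

On this toy data, feature 0 shares about log 2 of information with the label, both marginally and after conditioning on feature 1, while feature 1 shares essentially none. A conditional independence test built on such a quantity would treat a feature whose conditional value falls below a chosen threshold as independent of the label given the conditioning set.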

Acknowledgements

This work was supported by grants from the National Natural Science Foundation of China (No. 62076116) and the Natural Science Foundation of Fujian Province (Nos. 2021J02049, 2020J01811, and 2020J01792).

Author information

Corresponding author

Correspondence to Yaojin Lin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, J., Lin, Y., Li, L. et al. Multi-label causal feature selection based on neighbourhood mutual information. Int. J. Mach. Learn. & Cyber. 13, 3509–3522 (2022). https://doi.org/10.1007/s13042-022-01609-4

  • DOI: https://doi.org/10.1007/s13042-022-01609-4
