Constrained nonnegative matrix factorization-based semi-supervised multilabel learning

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

In many multilabel learning applications, fully labelled instances are scarce, while partially labelled and unlabelled data are far more common because manual labelling is expensive. However, most existing models assume that sufficient fully labelled training data is available. To deal with partially labelled and unlabelled data effectively, we present a novel semi-supervised multilabel learning approach based on constrained non-negative matrix factorization. The approach assumes that if two instances are highly similar in terms of their features, they will also be similar in their associated label sets. Specifically, we first define three matrices to measure the similarity of each pair of instances in two different ways. The optimal assignment of labels to the unlabelled instances is then determined by minimizing the difference between these two sets of similarities via a non-negative matrix factorization process. We also present a threshold learning algorithm to determine the classification threshold for each label. Extensive experiments are conducted on various datasets, and the results demonstrate that our method performs significantly better than other state-of-the-art approaches. It is especially suitable for situations in which the labelled training set is small or a subset of the training data is only partially labelled.
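
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not specify the three matrices or the threshold learning algorithm, so everything below is an assumption made for illustration. The sketch uses cosine similarity for the feature side, the product F Fᵀ for the label side, the standard damped multiplicative update for symmetric NMF, and a simple per-label F1 search on held-out labelled data for the thresholds. The function names `feature_similarity`, `propagate_labels`, and `choose_thresholds` are hypothetical.

```python
import numpy as np


def feature_similarity(X):
    """Cosine similarity between rows of X, clipped to be non-negative."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    return np.clip(Xn @ Xn.T, 0.0, None)


def propagate_labels(X, Y_labelled, n_iter=200, eps=1e-9, seed=0):
    """Estimate non-negative label scores F so that F @ F.T approximates
    the feature similarity matrix (assumed objective, not the paper's).

    X          : (n, d) features; the first l rows are the labelled instances.
    Y_labelled : (l, q) binary label matrix for those rows.
    Returns F  : (n, q) scores; rows 0..l-1 equal Y_labelled.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    l, q = Y_labelled.shape
    S = feature_similarity(X)
    F = rng.uniform(0.1, 1.0, size=(n, q))
    F[:l] = Y_labelled
    for _ in range(n_iter):
        # Damped multiplicative update for min ||S - F F^T||_F^2 with F >= 0.
        numer = S @ F
        denom = F @ (F.T @ F) + eps
        F *= 0.5 + 0.5 * numer / denom
        # Constraint: known labels stay fixed after every update.
        F[:l] = Y_labelled
    return F


def choose_thresholds(scores, Y_true):
    """Per label, pick the score cut-off with the best F1 on held-out
    labelled data (a generic heuristic, assumed here for illustration)."""
    q = Y_true.shape[1]
    thresholds = np.zeros(q)
    for j in range(q):
        best_f1 = -1.0
        for t in np.unique(scores[:, j]):
            pred = scores[:, j] >= t
            tp = np.sum(pred & (Y_true[:, j] == 1))
            fp = np.sum(pred & (Y_true[:, j] == 0))
            fn = np.sum(~pred & (Y_true[:, j] == 1))
            f1 = 2 * tp / max(2 * tp + fp + fn, 1)
            if f1 > best_f1:
                best_f1, thresholds[j] = f1, t
    return thresholds
```

In this sketch, an unlabelled instance whose features resemble those of a labelled instance ends up with a high score on that instance's labels, because matching the two similarity measures pulls their score rows together. Clamping the labelled rows after each step means the usual convergence guarantee for the multiplicative update no longer strictly applies, so the iteration count should be treated as a tuning parameter.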

Notes

  1. http://mulan.sourceforge.net/datasets-mlc.html

Acknowledgements

Many thanks to the Advanced Analytics Institute (AAi) at the University of Technology, Sydney for providing the learning and working conditions for this study. This work is supported by the Public Welfare Technology Application Research Project of Zhejiang Province, China (2016C33196, 2017C33105), the Natural Science Foundation of Zhejiang Province, China (LY15F020016), and the Science and Technology Project of the State Administration of Press, Publication, Radio, Film and Television of China (201309).

Author information

Corresponding author

Correspondence to Dingguo Yu.

About this article

Cite this article

Yu, D., Fu, B., Xu, G. et al. Constrained nonnegative matrix factorization-based semi-supervised multilabel learning. Int. J. Mach. Learn. & Cyber. 10, 1093–1100 (2019). https://doi.org/10.1007/s13042-018-0787-8

  • DOI: https://doi.org/10.1007/s13042-018-0787-8
