Abstract
Multi-class classification can be solved by decomposing it into a set of binary classification problems according to some encoding rule, e.g., one-vs-one, one-vs-rest, or error-correcting output codes. Existing works solve these binary classification problems in the original feature space, which might be suboptimal because different binary classification problems correspond to different positive and negative examples. In this paper, we propose to learn label-specific features for each decomposed binary classification problem so as to account for the specific characteristics contained in its positive and negative examples. Specifically, to generate the label-specific features, clustering analysis is conducted separately on the positive and negative examples in each decomposed binary data set to discover their inherent information, and the label-specific features for an example are then obtained by measuring the similarity between it and all cluster centers. Experiments clearly validate the effectiveness of learning label-specific features for decomposition-based multi-class classification.
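The feature-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm, how many clusters, or which similarity measure is used, so the sketch assumes plain k-means, a fixed cluster count k, Euclidean distance to each center, and a one-vs-rest decomposition. The names kmeans and label_specific_mapping are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns k centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def label_specific_mapping(X, y, positive_class, k=2):
    """Build a feature mapping for one one-vs-rest binary problem (assumed encoding)."""
    pos = [x for x, c in zip(X, y) if c == positive_class]
    neg = [x for x, c in zip(X, y) if c != positive_class]
    # Cluster positives and negatives separately, then pool the centers.
    centers = kmeans(pos, min(k, len(pos))) + kmeans(neg, min(k, len(neg)))
    # New representation: distance from x to every positive and negative center
    # (Euclidean distance is an assumption standing in for "similarity").
    return lambda x: [math.dist(x, c) for c in centers]


# Toy three-class data set, made up purely for illustration.
X = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.0, 5.0), (0.2, 5.1)]
y = ["a", "a", "b", "b", "c", "c"]
phi_a = label_specific_mapping(X, y, "a", k=1)
features = phi_a((0.0, 0.1))  # one value per cluster center
```

Each decomposed binary problem gets its own mapping, so a binary classifier for class "a" versus the rest would be trained on phi_a(x) rather than on x itself. A query point near the class "a" examples lies closer to the positive center than to the negative one.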
Acknowledgements
The authors wish to thank the associate editor and anonymous reviewers for their helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant No. 62225602).
Author information
Bin-Bin Jia received the bachelor’s degree from North China Electric Power University, China in 2010, and the master’s degree from Beihang University, China in 2013. He joined Lanzhou University of Technology, China in 2013, where he is currently an assistant professor. From September 2017 to March 2022, he studied at Southeast University, where he received the PhD degree. His main research interests include machine learning and data mining.
Jun-Ying Liu received the bachelor’s degree from North China Electric Power University, China in 2010, and the master’s degree from Beijing Jiaotong University, China in 2012. Currently, she is an assistant professor at the College of Electrical and Information Engineering, Lanzhou University of Technology, China. Her main research interests include machine learning and data mining.
Jun-Yi Hang received the BSc and MSc degrees from Beihang University, China in 2017 and 2020, respectively. Currently, he is a PhD student at the School of Computer Science and Engineering, Southeast University, China. His main research interests include machine learning and data mining, especially in learning from multi-label data.
Min-Ling Zhang received the BSc, MSc, and PhD degrees in computer science from Nanjing University, China in 2001, 2004 and 2007, respectively. Currently, he is a Professor at the School of Computer Science and Engineering, Southeast University, China. His main research interests include machine learning and data mining. In recent years, Dr. Zhang has served as General Co-Chair of ACML’18, Program Co-Chair of PAKDD’19, CCF-ICAI’19, ACML’17, CCFAI’17, and PRICAI’16, and Senior PC member or Area Chair of AAAI 2022–2024, IJCAI 2017–2023, KDD 2021–2023, ICDM 2015–2022, etc. He is also on the editorial board of IEEE Transactions on Pattern Analysis and Machine Intelligence, ACM Transactions on Intelligent Systems and Technology, Neural Networks, Science China Information Sciences, Frontiers of Computer Science, etc. Dr. Zhang is a Steering Committee Member of ACML and PAKDD, Vice Chair of the CAAI Machine Learning Society, and a standing committee member of the CCF Artificial Intelligence & Pattern Recognition Society. He is a Distinguished Member of CCF and CAAI, and a Senior Member of AAAI, ACM, and IEEE.
Cite this article
Jia, BB., Liu, JY., Hang, JY. et al. Learning label-specific features for decomposition-based multi-class classification. Front. Comput. Sci. 17, 176348 (2023). https://doi.org/10.1007/s11704-023-3076-y
DOI: https://doi.org/10.1007/s11704-023-3076-y