Abstract
Multi-label classification is a machine learning task in which an instance may be associated with multiple class labels simultaneously, often drawn from a large label set. Although significant progress has been achieved, multi-label classification remains challenging because many emerging applications involve a high-dimensional label space. To address this, dimensionality reduction, originally developed for the feature space, has also been applied to the label space by exploiting label correlation information, yielding two kinds of techniques: label embedding and label selection. Label embedding has seen many successful approaches, but label selection has received less attention. The infinite feature selection algorithm (Inf-FS) finds the most discriminative feature subset. It treats feature subsets as paths in a graph and, using algebraic theory, evaluates the values of paths of arbitrary length. Letting the path length go to infinity reduces the computational complexity of the selection process and takes into account the value of every path (subset) that contains a specific feature. Ranking the features by these values then determines the feature subset to keep. We apply this algorithm to label selection by treating label subsets as paths. After label selection, an operator is needed to recover the original label space from the selected one: an effective classifier is trained on the selected label subset, and its predicted values are then propagated to the full label set. We apply our model to five benchmark data sets with more than 100 labels. Experimental results show that our method achieves superior classification performance over other state-of-the-art methods in terms of two performance evaluation metrics (precision and discounted gain@n) for high-dimensional label spaces.
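The path-based scoring and the recovery step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the pairwise weight between two labels is the absolute Pearson correlation of their columns, that the infinite sum over path lengths is evaluated with the standard geometric-series identity for matrices, and that recovery is a plain least-squares linear map. The function names, the `alpha` scaling parameter, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def infinite_label_scores(Y, alpha=0.9):
    """Score each label by the summed value of all paths, of any length, that start at it."""
    Y = np.asarray(Y, dtype=float)
    q = Y.shape[1]
    # Assumed edge weight: absolute Pearson correlation between label columns.
    with np.errstate(invalid="ignore", divide="ignore"):
        A = np.abs(np.nan_to_num(np.atleast_2d(np.corrcoef(Y, rowvar=False))))
    np.fill_diagonal(A, 0.0)
    # Scale A so that the spectral radius of r*A is alpha < 1 and the series converges.
    rho = np.max(np.abs(np.linalg.eigvalsh(A)))
    r = alpha / rho if rho > 0 else 0.0
    # Geometric series: sum over l >= 1 of (r*A)^l equals (I - r*A)^(-1) - I.
    S = np.linalg.inv(np.eye(q) - r * A) - np.eye(q)
    return S.sum(axis=1)


def select_labels(Y, k, alpha=0.9):
    """Return the indices of the k highest-scoring labels."""
    return np.argsort(-infinite_label_scores(Y, alpha))[:k]


def fit_recovery(Y, selected):
    """Assumed recovery operator: least-squares map from selected labels to all labels."""
    Y = np.asarray(Y, dtype=float)
    return np.linalg.pinv(Y[:, selected]) @ Y


def recover(pred_selected, W, threshold=0.5):
    """Propagate predictions on the selected labels to the full label set."""
    return (np.asarray(pred_selected, dtype=float) @ W >= threshold).astype(int)


# Toy label matrix: 3 random labels plus exact copies of the first two.
rng = np.random.default_rng(0)
base = rng.integers(0, 2, size=(50, 3))
Y_demo = np.hstack([base, base[:, :2]])

selected = select_labels(Y_demo, k=3)
W = fit_recovery(Y_demo, selected)
# Stand-in for classifier output: the true values of the selected labels.
Y_hat = recover(Y_demo[:, selected], W)
print("selected labels:", selected)
print("fraction of label entries recovered:", (Y_hat == Y_demo).mean())
```

In a real pipeline the array passed to `recover` would be the output of a classifier trained on the selected label columns only. Ranking purely by path value favours mutually correlated labels, so on this toy matrix the selected set may contain redundant copies and leave the independent label poorly recovered.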
Supported by the Natural Science Foundation of China (NSFC) under grants 62076134 and 62173186.
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pan, Y., Li, J., Xu, J. (2023). Infinite Label Selection Method for Mutil-label Classification. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1791. Springer, Singapore. https://doi.org/10.1007/978-981-99-1639-9_30
DOI: https://doi.org/10.1007/978-981-99-1639-9_30
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1638-2
Online ISBN: 978-981-99-1639-9
eBook Packages: Computer Science, Computer Science (R0)