Label Selection Algorithm Based on Iteration Column Subset Selection for Multi-label Classification

Peng, Tao; Li, Jun; Xu, Jianhua

doi:10.1007/978-3-031-12423-5_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13426)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

In multi-label classification, each sample can be associated with a set of class labels. When the number of labels grows to hundreds or even thousands, existing multi-label classification methods often become computationally inefficient. To address this, dimensionality reduction is applied to the label space by exploiting label correlation information, which gives rise to label embedding and label selection techniques. Compared with the large body of work on label embedding, label selection has received less attention because of its difficulty, so designing more effective label selection techniques for multi-label classification remains a challenging task. Column subset selection is the problem of selecting a small subset of columns from a large data matrix as a form of interpretable data summarization. It translates naturally to label selection, because it provides simple linear models for low-rank data reconstruction. Iterative column subset selection is one method for solving the column subset selection problem, and it achieves good results on that problem. In this paper, we first apply iterative column subset selection to select a small subset of columns from a large label matrix; in the prediction stage, we then apply additional processing to the recovery matrix. This yields a new multi-label classification method based on iterative column subset selection. The new method is tested on six publicly available datasets with varying numbers of labels. The experimental evaluation shows that the new method works particularly well on datasets with a large number of labels.
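
For concreteness, the column subset selection objective referred to above is commonly written as follows. The notation here is an assumption made for illustration and is not taken from the chapter: $Y \in \{0,1\}^{n \times q}$ is the label matrix for $n$ samples and $q$ labels, $S$ is an index set of $k$ selected labels, $Y_S$ holds the corresponding columns of $Y$, and $Y_S^{+}$ is its Moore-Penrose pseudoinverse.

```latex
\min_{S \subseteq \{1,\dots,q\},\ |S| = k} \ \bigl\| Y - Y_S \, Y_S^{+} \, Y \bigr\|_F^2
```

Under this formulation, $Y_S^{+} Y$ acts as a linear map from the $k$ selected labels back to all $q$ labels, which is the sense in which the selected columns give a simple linear model for low-rank reconstruction.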

Supported by the Natural Science Foundation of China (NSFC) under grants 62076134 and 61703096.

Author information

Authors and Affiliations

School of Computer and Electronic Information, School of Artificial Intelligence, Nanjing Normal University, Nanjing, 210023, Jiangsu, China
Tao Peng, Jun Li & Jianhua Xu

Corresponding author

Correspondence to Jianhua Xu.

Editor information

Editors and Affiliations

University of Vienna, Vienna, Austria
Christine Strauss
University of Calabria, Rende, Italy
Alfredo Cuzzocrea
Johannes Kepler University of Linz, Linz, Austria
Gabriele Kotsis
Vienna University of Technology, Vienna, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Austria
Ismail Khalil

Appendix

The following is the detailed process of the iterative column subset selection algorithm.
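
As a rough illustration of the general idea, the sketch below implements a naive swap-based local search for column subset selection in Python with NumPy. It is a simplified stand-in written under stated assumptions, not the procedure used in this chapter or the efficient algorithm of Ordozgoiti et al.: the random starting subset, the brute-force evaluation of every candidate swap, and the names `reconstruction_error`, `iterative_css`, `max_sweeps`, and `seed` are all hypothetical choices made for readability.

```python
import numpy as np


def reconstruction_error(Y, cols):
    """Frobenius norm of Y - C C^+ Y, where C = Y[:, cols]."""
    C = Y[:, cols]
    # Least squares gives W minimizing ||C W - Y||_F, so C @ W is the
    # projection of every column of Y onto the span of the chosen columns.
    W, *_ = np.linalg.lstsq(C, Y, rcond=None)
    return float(np.linalg.norm(Y - C @ W))


def iterative_css(Y, k, max_sweeps=10, seed=0):
    """Naive iterative column subset selection (illustrative assumption).

    Starts from a random subset of k columns and repeatedly tries to swap
    each selected column for an unselected one, keeping a swap whenever it
    lowers the reconstruction error. Stops when a full sweep brings no gain.
    """
    rng = np.random.default_rng(seed)
    n_cols = Y.shape[1]
    selected = [int(j) for j in rng.choice(n_cols, size=k, replace=False)]
    best = reconstruction_error(Y, selected)

    for _ in range(max_sweeps):
        improved = False
        for pos in range(k):
            for cand in range(n_cols):
                if cand in selected:
                    continue
                trial = selected.copy()
                trial[pos] = cand
                err = reconstruction_error(Y, trial)
                if err < best - 1e-12:  # accept strict improvements only
                    best, selected, improved = err, trial, True
        if not improved:
            break
    return sorted(selected), best


if __name__ == "__main__":
    # Toy label matrix: labels 3, 4, 5 duplicate labels 0, 1, 2.
    base = np.eye(4)[:, :3]
    Y = np.hstack([base, base])
    cols, err = iterative_css(Y, k=3)
    print("selected labels:", cols, "reconstruction error:", err)
```

Each candidate evaluation here solves a full least-squares problem, so the cost grows quickly with the number of labels; practical implementations update the projection incrementally instead of recomputing it for every swap.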

About this paper

Cite this paper

Peng, T., Li, J., Xu, J. (2022). Label Selection Algorithm Based on Iteration Column Subset Selection for Multi-label Classification. In: Strauss, C., Cuzzocrea, A., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2022. Lecture Notes in Computer Science, vol 13426. Springer, Cham. https://doi.org/10.1007/978-3-031-12423-5_22

DOI: https://doi.org/10.1007/978-3-031-12423-5_22
Published: 29 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12422-8
Online ISBN: 978-3-031-12423-5
eBook Packages: Computer Science, Computer Science (R0)
