Abstract
Multi-label text classification assigns multiple relevant category labels to each text and is widely applied in practice. To improve performance, most existing methods focus only on optimizing document and label representations, on the assumption that accurate label-document similarity is the crucial factor. However, whether the potential correlations between labels are exploited and whether the long-tail distribution of labels is addressed are also key factors affecting multi-label classification performance. To this end, we propose a multi-label text classification model called DV-MLTC, which predicts multiple labels for a text using a dual-view graph convolutional network. Specifically, we use graph convolutional networks to explore the potential correlations between labels in both the global and local views. First, we capture the global consistency of labels on a global label graph built from existing statistical information, and we generate label paths with a random walk algorithm to reconstruct the label graph. Then, to capture relationships between low-frequency co-occurring labels on the reconstructed graph, we use the local consistency of labels to guide the generation of reasonable co-occurring label pairs within the local neighborhood, which also helps alleviate the long-tail distribution of labels. Finally, we integrate the global and local consistency of labels to address the highly skewed distribution caused by incomplete label co-occurrence patterns in the label co-occurrence graph. The evaluation shows that our proposed model achieves competitive results compared with existing state-of-the-art methods. Moreover, our model achieves a better balance between efficiency and performance.
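The dual-view idea summarized above can be illustrated with a minimal NumPy sketch. It is not the DV-MLTC implementation: the abstract gives no formulas, so the conditional-probability edge weights, the window-based graph reconstruction, the symmetric-normalized propagation rule, the additive fusion of the two views, and all function names (`cooccurrence_graph`, `random_walks`, `graph_from_walks`, `gcn_layer`) are assumptions made for illustration only.

```python
import numpy as np

def cooccurrence_graph(Y):
    """Global label graph from a multi-hot label matrix Y (docs x labels).

    Assumed weighting: A[i, j] estimates P(label j | label i).
    """
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                      # counts[i, j] = docs carrying both labels
    freq = np.diag(counts).copy()         # freq[i] = docs carrying label i
    np.fill_diagonal(counts, 0.0)         # no self-loops in the statistics
    return counts / np.maximum(freq, 1.0)[:, None]

def random_walks(A, walk_len, walks_per_node, rng):
    """Sample label paths, moving to a neighbor in proportion to edge weight."""
    n = A.shape[0]
    paths = []
    for start in range(n):
        for _ in range(walks_per_node):
            path = [start]
            while len(path) < walk_len:
                w = A[path[-1]]
                if w.sum() == 0:          # isolated label: stop this walk early
                    break
                path.append(int(rng.choice(n, p=w / w.sum())))
            paths.append(path)
    return paths

def graph_from_walks(paths, n, window):
    """Rebuild a label graph by counting label pairs within `window` steps on a path."""
    B = np.zeros((n, n))
    for path in paths:
        for i, u in enumerate(path):
            for v in path[i + 1 : i + 1 + window]:
                if u != v:
                    B[u, v] += 1.0
                    B[v, u] += 1.0
    return B

def gcn_layer(A, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W), D = row sums of A + I."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy data: 4 documents, 4 labels (hypothetical).
rng = np.random.default_rng(0)
Y = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 0]])
A_global = cooccurrence_graph(Y)
paths = random_walks(A_global, walk_len=4, walks_per_node=5, rng=rng)
A_local = graph_from_walks(paths, n=4, window=2)

H = np.eye(4)                             # one-hot label features as a placeholder
W = rng.normal(size=(4, 3))               # random, untrained weights
# Assumed fusion: sum the label representations from the two views.
Z = gcn_layer(A_global, H, W) + gcn_layer(A_local, H, W)
```

In this toy setup the reconstructed graph can link labels that never share a document but are reachable through a common neighbor on a walk, which is one plausible reading of how low-frequency label pairs gain edges.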
Availability of Data and Materials
The datasets analyzed during the current study were all derived from the following public domain resources: AAPD: https://git.uwaterloo.ca/jimmylin/Castor-data/tree/master/datasets/AAPD/; RCV1: http://www.ai.mit.edu/projects/jmlr/papers/volume5/lewis04a/lyrl2004_rcv1v2_README.htm; EUR-Lex: http://nlp.cs.aueb.gr/software.html.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (No. 61862058), the Natural Science Foundation of Gansu Province (Nos. 20JR5RA518 and 21JR7RA114), and the Industrial Support Project of Gansu Colleges (No. 2022CYZC11).
Author information
Contributions
X.L. and B.Y.: Conceptualization, Methodology, Formal analysis, Software, Investigation, Validation, Resources, Writing (original draft, review and editing), Visualization. Q.P. and S.F.: Resources, Writing (review and editing), Supervision.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Consent to Participate
The authors declare that they agree to participate.
Consent for Publication
The authors declare that they agree to publish.
About this article
Cite this article
Li, X., You, B., Peng, Q. et al. Dual-view graph convolutional network for multi-label text classification. Appl Intell 54, 9363–9380 (2024). https://doi.org/10.1007/s10489-024-05666-w