
Multi-Label Text Classification Model Based on Multi-Level Constraint Augmentation and Label Association Attention

Published: 15 January 2024

Abstract

In multi-label text classification, a text usually corresponds to multiple label categories, and the labels exhibit correlations and a hierarchical structure. However, when the label hierarchy is unknown, the label distribution is imbalanced, which makes it difficult for a model to classify low-frequency labels. In addition, semantic similarities between labels make them hard for a model to distinguish. In this article, we propose a multi-label text classification model based on multi-level constraint augmentation and label association attention. Compared with traditional methods, our method makes two contributions. (1) To alleviate the imbalance among label categories while keeping generated samples plausible, we propose a data augmentation method based on multi-level constraints: during sample generation, the generated text is constrained by the generation history, the original text of the sample, and the sample's topic. (2) To enable the model to recognize associated labels accurately, we propose an interaction mechanism based on label association attention and a filter gate, which combines text information with label weight information. Our classification model also weights sentences by their importance and exploits the co-occurrence relationships between labels. Experimental results on three benchmark datasets show that our model outperforms state-of-the-art methods on all main evaluation metrics, especially when predicting low-frequency labels with sparse samples.
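The label association attention in contribution (2) follows a pattern common to label-attention models: each label attends over the sentence representations of a document to build its own label-specific document vector. The sketch below is a minimal, generic illustration of that idea only, not the authors' actual architecture; the function and variable names are hypothetical, and the filter gate, label co-occurrence modeling, and learned parameters are omitted.

```python
import numpy as np

def label_attention(sent_vecs: np.ndarray, label_embs: np.ndarray) -> np.ndarray:
    """For each label, softmax-weight a document's sentence vectors by relevance.

    sent_vecs:  (S, d) sentence representations of one document.
    label_embs: (L, d) label embeddings.
    Returns an (L, d) matrix: one label-specific document vector per label.
    """
    scores = label_embs @ sent_vecs.T                       # (L, S) dot-product relevance
    w = np.exp(scores - scores.max(axis=1, keepdims=True))  # numerically stable softmax
    w /= w.sum(axis=1, keepdims=True)                       # attention weights per label
    return w @ sent_vecs                                    # (L, d) weighted sentence sums

# Toy usage: 4 sentences, 3 labels, 8-dimensional vectors.
rng = np.random.default_rng(0)
docs = label_attention(rng.normal(size=(4, 8)), rng.normal(size=(3, 8)))
```

Each row of the result can then be scored against its label (e.g., by a per-label linear layer) so that the evidence used for one label does not dilute the evidence for another.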



    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 1
    January 2024, 385 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613498

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 15 January 2024
    Online AM: 01 May 2023
    Accepted: 17 February 2023
    Revised: 24 December 2022
    Received: 30 September 2022
    Published in TALLIP Volume 23, Issue 1


    Author Tags

    1. Multi-label text classification
    2. data augmentation
    3. label association attention

    Qualifiers

    • Research-article

    Funding Sources

    • National Social Science Fund of China
    • SSPU young talent

    Article Metrics

    • Downloads (Last 12 months): 381
    • Downloads (Last 6 weeks): 19
    Reflects downloads up to 02 Mar 2025

    Cited By
    • (2025) Source Code Error Understanding Using BERT for Multi-Label Classification. IEEE Access 13, 3802–3822. DOI: 10.1109/ACCESS.2024.3525061
    • (2025) A social context-aware graph-based multimodal attentive learning framework for disaster content classification during emergencies. Expert Systems with Applications 259:C. DOI: 10.1016/j.eswa.2024.125337
    • (2025) All is attention for multi-label text classification. Knowledge and Information Systems 67:2, 1249–1270. DOI: 10.1007/s10115-024-02253-w
    • (2024) Context-based Marwari text clustering. In 2024 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES), 1–5. DOI: 10.1109/SPICES62143.2024.10779675
    • (2024) Exploring Multi-Label Data Augmentation for LLM Fine-Tuning and Inference in Requirements Engineering: A Study with Domain Expert Evaluation. In 2024 International Conference on Machine Learning and Applications (ICMLA), 432–439. DOI: 10.1109/ICMLA61862.2024.00064
    • (2023) Improving Clothing Product Quality and Reducing Waste Based on Consumer Review Using RoBERTa and BERTopic Language Model. Big Data and Cognitive Computing 7:4, 168. DOI: 10.3390/bdcc7040168
