
Multi-Label Text Classification Model Based on Multi-Level Constraint Augmentation and Label Association Attention

Published: 15 January 2024

Abstract

In multi-label text classification, a text usually corresponds to multiple label categories, and the labels exhibit correlations and a hierarchical structure. However, when the label hierarchy is unknown, the label distribution is imbalanced, which makes it difficult for a model to classify low-frequency labels. In addition, semantic similarities between labels make them hard for a model to distinguish. In this article, we propose a multi-label text classification model based on multi-level constraint augmentation and label association attention. Compared with traditional methods, our method makes two contributions. (1) To alleviate the imbalance among label categories while keeping generated samples plausible, we propose a data augmentation method based on multi-level constraints: during sample generation, the generated text is constrained by the generation history, the original text of the sample, and the sample's topic. (2) To enable the model to recognize associated labels accurately, we propose an interaction mechanism based on label association attention and a filter gate, which combines text information with label weight information. Our classification model also weights sentences by their importance and exploits the co-occurrence relationships between labels. Experimental results on three benchmark datasets show that our model outperforms state-of-the-art methods on all main evaluation metrics, especially when predicting low-frequency labels with sparse samples.
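The label association attention in contribution (2) follows a pattern common to label-attention models: each label attends over the sentence representations of a document to build its own label-specific document vector. The sketch below is a minimal, generic illustration of that idea only, not the authors' actual architecture; the function and variable names are hypothetical, and the filter gate, label co-occurrence modeling, and learned parameters are omitted.

```python
import numpy as np

def label_attention(sent_vecs: np.ndarray, label_embs: np.ndarray) -> np.ndarray:
    """For each label, softmax-weight a document's sentence vectors by relevance.

    sent_vecs:  (S, d) sentence representations of one document.
    label_embs: (L, d) label embeddings.
    Returns an (L, d) matrix: one label-specific document vector per label.
    """
    scores = label_embs @ sent_vecs.T                       # (L, S) dot-product relevance
    w = np.exp(scores - scores.max(axis=1, keepdims=True))  # numerically stable softmax
    w /= w.sum(axis=1, keepdims=True)                       # attention weights per label
    return w @ sent_vecs                                    # (L, d) weighted sentence sums

# Toy usage: 4 sentences, 3 labels, 8-dimensional vectors.
rng = np.random.default_rng(0)
docs = label_attention(rng.normal(size=(4, 8)), rng.normal(size=(3, 8)))
```

Each row of the result can then be scored against its label (e.g., by a per-label linear layer) so that the evidence used for one label does not dilute the evidence for another.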



    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 1
    January 2024, 385 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613498

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 15 January 2024
    Online AM: 01 May 2023
    Accepted: 17 February 2023
    Revised: 24 December 2022
    Received: 30 September 2022
    Published in TALLIP Volume 23, Issue 1


    Author Tags

    1. Multi-label text classification
    2. data augmentation
    3. label association attention

    Qualifiers

    • Research-article

    Funding Sources

    • National Social Science Fund of China
    • SSPU young talent

    Article Metrics

    • Downloads (Last 12 months): 381
    • Downloads (Last 6 weeks): 19
    Reflects downloads up to 02 Mar 2025

    Cited By
    • (2025) Source Code Error Understanding Using BERT for Multi-Label Classification. IEEE Access 13, 3802–3822. DOI: 10.1109/ACCESS.2024.3525061
    • (2025) A social context-aware graph-based multimodal attentive learning framework for disaster content classification during emergencies. Expert Systems with Applications 259:C. DOI: 10.1016/j.eswa.2024.125337
    • (2025) All is attention for multi-label text classification. Knowledge and Information Systems 67:2, 1249–1270. DOI: 10.1007/s10115-024-02253-w
    • (2024) Context-based Marwari text clustering. In 2024 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES), 1–5. DOI: 10.1109/SPICES62143.2024.10779675
    • (2024) Exploring Multi-Label Data Augmentation for LLM Fine-Tuning and Inference in Requirements Engineering: A Study with Domain Expert Evaluation. In 2024 International Conference on Machine Learning and Applications (ICMLA), 432–439. DOI: 10.1109/ICMLA61862.2024.00064
    • (2023) Improving Clothing Product Quality and Reducing Waste Based on Consumer Review Using RoBERTa and BERTopic Language Model. Big Data and Cognitive Computing 7:4, 168. DOI: 10.3390/bdcc7040168
