Abstract
With advances in deep learning and the growing public availability of legal texts, legal text classification has attracted research attention. Current work focuses mainly on multiclass classification, and few studies address multi-label classification of legal texts. This paper uses a label sequence generation model to study multi-label classification of legal texts at the sentence level. General-purpose multi-label classification methods are often designed for long texts and ignore the transfer relationships between labels. We propose a method based on label embedding and a capsule neural network for multi-label classification of legal text. The method applies a graph convolutional network to learn label embeddings and the correlations between labels, a fusion layer to combine the label information with the contextual semantic information of the text, and a capsule neural network to extract spatial feature information from the text. Experimental results on three legal text datasets show that the proposed model outperforms the baseline methods, verifying its effectiveness for legal text that is short and has an uncertain number of characters per word. In addition, experiments on two datasets commonly used for multi-label classification show that the method is competitive with state-of-the-art multi-label text classification models.
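As a rough illustration of two building blocks named in the abstract, the sketch below shows one graph-convolution step over a label graph and the capsule "squash" nonlinearity from Sabour et al. (2017). It is a minimal pure-Python sketch, not the authors' implementation: the propagation rule is the commonly used normalized form ReLU(D^{-1/2}(A + I)D^{-1/2} H W), and the adjacency matrix, feature matrix, weights, and all function names are hypothetical values chosen for illustration.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so each label keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]


def squash(vector, eps=1e-9):
    """Capsule squash: keeps the direction, maps the length into [0, 1)."""
    sq_norm = sum(v * v for v in vector)
    scale = (sq_norm / (1.0 + sq_norm)) / math.sqrt(sq_norm + eps)
    return [scale * v for v in vector]


# Hypothetical label graph: label 1 co-occurs with labels 0 and 2.
label_adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
# Hypothetical initial label features (one-hot) and a 3x2 weight matrix.
label_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]

# Each row is a 2-dimensional embedding for one label.
label_embeddings = gcn_layer(label_adj, label_features, weights)
# Squashing a capsule's raw output vector bounds its length below 1.
capsule_output = squash([3.0, 4.0])
```

In this sketch each label's embedding mixes in the features of the labels it co-occurs with, which is one way label correlations can be encoded; how the paper builds the label graph and fuses the embeddings with text features is not specified in the abstract and is not modeled here.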
Data Availability
The processed legal text data used to support the findings of this study are currently under embargo while the research findings are being commercialized. Requests for data made 6–12 months after publication of this article will be considered by the corresponding author.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant 61872111. This work is also supported by the Opening Project of Science and Technology on Communication Networks Laboratory.
Ethics declarations
Conflict of Interests
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Chen, Z., Li, S., Ye, L. et al. Multi-label classification of legal text based on label embedding and capsule network. Appl Intell 53, 6873–6886 (2023). https://doi.org/10.1007/s10489-022-03455-x
DOI: https://doi.org/10.1007/s10489-022-03455-x