ABSTRACT
The legal charge prediction task aims to determine the appropriate charges from the fact description of a case. Most existing methods formulate it as a multi-class text classification problem and have made substantial progress. However, performance on low-frequency charges remains unsatisfactory. Previous studies indicate that leveraging charge label information can help with this task, but approaches to using the label information have not been fully explored. In this paper, inspired by vision-language information fusion techniques in the multi-modal field, we propose a novel model (denoted as LeapBank) that fuses the representations of text and labels to enhance legal charge prediction. Specifically, we devise a representation fusion block based on the bilinear attention network so that labels and text tokens interact seamlessly. Extensive experiments on three real-world datasets compare our proposed method with state-of-the-art models. Experimental results show that LeapBank obtains up to 8.5% Macro-F1 improvement on low-frequency charges, demonstrating our model's superiority and competitiveness.
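To make the fusion idea concrete, the following is a minimal sketch of a generic low-rank bilinear attention step between text-token vectors and label vectors. It is not the LeapBank implementation: the function names, the dimensions, the random weights standing in for learned parameters, and the choice to reuse one pair of projections for both the attention map and the fused feature are all assumptions made for illustration.

```python
import math
import random

def matvec_t(W, x):
    """Compute W^T x, where W is a list of len(x) rows with k columns each."""
    k = len(W[0])
    return [sum(W[r][c] * x[r] for r in range(len(x))) for c in range(k)]

def relu(v):
    return [max(0.0, a) for a in v]

def bilinear_attention_fuse(tokens, labels, U, V, p):
    """Fuse token and label vectors with one low-rank bilinear attention map.

    tokens: n vectors of size d_t; labels: m vectors of size d_l.
    U (d_t x k), V (d_l x k) and p (k) are hypothetical learned parameters.
    Returns (attention map of shape n x m, fused vector of size k).
    """
    tok_proj = [relu(matvec_t(U, x)) for x in tokens]   # n x k
    lab_proj = [relu(matvec_t(V, y)) for y in labels]   # m x k
    k = len(p)
    # Bilinear logit for every (token, label) pair.
    logits = [[sum(p[c] * t[c] * l[c] for c in range(k)) for l in lab_proj]
              for t in tok_proj]
    # Softmax over all pairs, so the whole map sums to 1.
    peak = max(max(row) for row in logits)
    exps = [[math.exp(a - peak) for a in row] for row in logits]
    total = sum(sum(row) for row in exps)
    attn = [[e / total for e in row] for row in exps]
    # Attention-weighted sum of elementwise products of the projections.
    fused = [sum(attn[i][j] * tok_proj[i][c] * lab_proj[j][c]
                 for i in range(len(tokens)) for j in range(len(labels)))
             for c in range(k)]
    return attn, fused

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Toy inputs (assumed sizes): 5 fact-description tokens, 3 charge labels.
rng = random.Random(0)
tokens = random_matrix(5, 4, rng)   # d_t = 4
labels = random_matrix(3, 6, rng)   # d_l = 6
U = random_matrix(4, 8, rng)        # rank k = 8
V = random_matrix(6, 8, rng)
p = [rng.uniform(-1.0, 1.0) for _ in range(8)]

attn, fused = bilinear_attention_fuse(tokens, labels, U, V, p)
print(len(attn), len(attn[0]), len(fused))
```

In this sketch every token is scored against every label, so a rare charge label can still attend to the few tokens that match it. In a trained model, the fused vector would typically feed a classifier over charges.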
Index Terms
- Legal Charge Prediction via Bilinear Attention Network