research-article

Legal Charge Prediction via Bilinear Attention Network

Authors:
Yuquan Le, Hunan University, Changsha, China
Yuming Zhao, JD AI Research, Beijing, China
Meng Chen, JD AI Research, Beijing, China
Zhe Quan, Hunan University, Changsha, China
Xiaodong He, JD AI Research, Beijing, China
Kenli Li, Hunan University, Changsha, China

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, October 2022, Pages 1024–1033. https://doi.org/10.1145/3511808.3557379

Published: 17 October 2022

ABSTRACT

The legal charge prediction task aims to determine the appropriate charges from the fact description of a case. Most existing methods formulate it as a multi-class text classification problem and have achieved substantial progress. However, performance on low-frequency charges remains unsatisfactory. Previous studies indicate that leveraging charge label information can help with this task, but approaches to using the label information have not been fully explored. In this paper, inspired by vision-language information fusion techniques from the multi-modal field, we propose a novel model (denoted LeapBank) that fuses the representations of text and labels to improve legal charge prediction. Specifically, we devise a representation fusion block based on the bilinear attention network so that label and text-token representations interact with each other. Extensive experiments on three real-world datasets compare our proposed method with state-of-the-art models. Experimental results show that LeapBank obtains up to 8.5% Macro-F1 improvement on low-frequency charges, demonstrating the model's superiority and competitiveness.
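
To make the idea of fusing text-token and label representations through bilinear attention more concrete, the following is a minimal NumPy sketch of a generic low-rank bilinear attention step. It is not the LeapBank implementation: the function name `bilinear_attention_fusion`, the shared projections `U` and `V`, the vector `p`, the ReLU activation, the softmax over the whole token-label map, and all dimensions are assumptions chosen for illustration only.

```python
import numpy as np


def softmax_all(x):
    """Softmax over every entry of a 2-D array (numerically stabilised)."""
    z = np.exp(x - x.max())
    return z / z.sum()


def bilinear_attention_fusion(X, Y, U, V, p):
    """Illustrative low-rank bilinear attention between tokens and labels.

    X: (n_tokens, d_x) text-token representations
    Y: (n_labels, d_y) label representations
    U: (d_x, k) and V: (d_y, k) low-rank projections (hypothetical parameters)
    p: (k,) vector that turns the joint features into attention logits
    Returns the (n_tokens, n_labels) attention map and a fused (k,) vector.
    """
    Xp = np.maximum(X @ U, 0.0)  # projected tokens, shape (n_tokens, k)
    Yp = np.maximum(Y @ V, 0.0)  # projected labels, shape (n_labels, k)

    # logits[i, j] = sum_k p[k] * Xp[i, k] * Yp[j, k]
    logits = (Xp * p) @ Yp.T
    A = softmax_all(logits)

    # fused[k] = sum_{i, j} A[i, j] * Xp[i, k] * Yp[j, k]
    fused = np.einsum("ik,ij,jk->k", Xp, A, Yp)
    return A, fused


# Toy usage with random inputs: 6 tokens, 4 charge labels, rank k = 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 8))
Y = rng.normal(size=(4, 3))
U = rng.normal(size=(8, 5))
V = rng.normal(size=(3, 5))
p = rng.normal(size=5)
A, fused = bilinear_attention_fusion(X, Y, U, V, p)
print(A.shape, fused.shape)
```

In this sketch every token-label pair receives its own attention weight, so the fused vector can draw on a label's representation even when few training cases carry that label. That is one plausible reading of why label fusion might help low-frequency charges; the paper itself should be consulted for the actual architecture.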

Supplemental Material

CIKM22-fp0487.mp4 (mp4, 26.9 MB)


Index Terms

1. Applied computing
  1. Law, social and behavioral sciences
    1. Law
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published in
CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022
5274 pages
ISBN:9781450392365
DOI:10.1145/3511808
General Chairs:
Mohammad Al Hasan, Indiana University Purdue University, Indianapolis, USA
Li Xiong, Emory University, Atlanta, USA
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bilinear attention network
charge prediction
label embedding
legal artificial intelligence
Acceptance Rates
CIKM '22 Paper Acceptance Rate: 621 of 2,257 submissions, 28%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%