short-paper

Mulan: A Multiple Residual Article-Wise Attention Network for Legal Judgment Prediction

Authors:

Junyi Chen
School of Information Engineering, Zhengzhou University, Zhengzhou, Henan, China
0000-0003-0789-9445

Lan Du
Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton, Victoria, Australia
0000-0002-9925-0223

Ming Liu
Faculty of Science Engineering and Built Environment, Deakin University, Waurn Ponds, Victoria, Australia
0000-0002-2160-6111

Xiabing Zhou
School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu, China
0000-0002-6497-8118

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 4, Article No. 81, pp. 1–15. https://doi.org/10.1145/3503157

Published: 04 April 2022

Abstract

Legal judgment prediction (LJP) aims to predict judgment results from the descriptions of individual legal cases. To better suit practical scenarios in which a case cites multiple law articles and carries multiple charges, we formulate LJP as a multi-label learning problem. We present a deep learning model that encodes the content of each legal case with a multi-residual convolutional neural network and the semantics of law articles with an article encoder, and we propose an article-wise attention mechanism to fuse the two types of encoded information. Experimental results on the CAIL2018 datasets show that our model provides a significant performance improvement over existing neural models in predicting relevant law articles and charges.
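
The abstract does not give the model's equations, so the following is only a minimal sketch of what an article-wise attention step with multi-label output could look like, not the authors' implementation. It assumes dot-product scoring between encoded case tokens and article vectors, a sigmoid score per article, and a fixed 0.5 threshold; the function names (`article_wise_attention`, `predict_articles`) and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def article_wise_attention(case_tokens, article_vecs):
    """Build one attention-weighted case representation per law article.

    Assumption: attention weights come from dot products between each
    encoded case token and the article vector.
    """
    dim = len(case_tokens[0])
    reps = []
    for art in article_vecs:
        weights = softmax([dot(tok, art) for tok in case_tokens])
        rep = [sum(w * tok[d] for w, tok in zip(weights, case_tokens))
               for d in range(dim)]
        reps.append(rep)
    return reps


def predict_articles(case_tokens, article_vecs, threshold=0.5):
    """Multi-label prediction: every article is scored independently."""
    reps = article_wise_attention(case_tokens, article_vecs)
    probs = [sigmoid(dot(rep, art)) for rep, art in zip(reps, article_vecs)]
    labels = [i for i, p in enumerate(probs) if p >= threshold]
    return labels, probs


# Toy inputs (hypothetical): two encoded case tokens, three article vectors.
case_tokens = [[1.0, 0.0], [0.0, 1.0]]
article_vecs = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
labels, probs = predict_articles(case_tokens, article_vecs)
print(labels)
```

Because each article gets its own sigmoid score rather than competing in a single softmax, a case can be assigned several articles at once, which is the point of the multi-label formulation.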

Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 21, Issue 4
July 2022
464 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3511099
Editor:
Imed Zitouni
Google, USA
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 4 April 2022
- Accepted: 1 November 2021
- Revised: 1 October 2021
- Received: 1 May 2021
Author Tags
Legal judgment prediction
neural networks
Qualifiers
- short-paper
- Refereed
