short-paper

Mulan: A Multiple Residual Article-Wise Attention Network for Legal Judgment Prediction

Published: 04 April 2022

Abstract

Legal judgment prediction (LJP) aims to predict judgment results from the description of an individual legal case. To better match practical scenarios, in which a case may cite multiple law articles and carry multiple charges, we formulate legal judgment prediction as a multi-label learning problem. We present a deep learning model that encodes the content of each legal case with a multi-residual convolutional neural network and the semantics of law articles with an article encoder. An article-wise attention mechanism is proposed to fuse the two types of encoded information. Experimental results on the CAIL2018 datasets show that our model significantly outperforms existing neural models in predicting relevant law articles and charges.
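
The abstract does not give the model's equations, so the following is only a minimal sketch of how an article-wise attention step could fuse the two encodings for multi-label prediction. It assumes scaled dot-product scoring and one sigmoid output per article; the function name `article_wise_attention`, the per-article classifier weights, and the toy sizes are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def article_wise_attention(case_tokens, article_vecs, out_weights, out_biases):
    """Illustrative fusion step; an assumption, not the paper's formulation.

    case_tokens : T vectors of size d (encoded case description)
    article_vecs: A vectors of size d (one per law article)
    out_weights : A vectors of size d (hypothetical per-article classifier)
    out_biases  : A floats
    Returns (probs, attn): A probabilities and an A x T attention matrix.
    """
    d = len(case_tokens[0])
    probs, attn = [], []
    for art, w, b in zip(article_vecs, out_weights, out_biases):
        # Scaled dot-product scores between this article and every case token.
        weights = softmax([dot(art, tok) / math.sqrt(d) for tok in case_tokens])
        # Article-specific summary of the case: attention-weighted token average.
        context = [sum(a * tok[j] for a, tok in zip(weights, case_tokens))
                   for j in range(d)]
        # Independent sigmoid per article, so several articles can fire at once.
        logit = dot(context, w) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        attn.append(weights)
    return probs, attn


random.seed(0)
T, A, d = 6, 4, 8  # toy sizes, chosen arbitrarily


def rand_vec():
    return [random.gauss(0.0, 1.0) for _ in range(d)]


case_tokens = [rand_vec() for _ in range(T)]
article_vecs = [rand_vec() for _ in range(A)]
out_weights = [rand_vec() for _ in range(A)]
out_biases = [0.0] * A

probs, attn = article_wise_attention(case_tokens, article_vecs, out_weights, out_biases)
# Multi-label decision: keep every article whose probability passes 0.5.
predicted_articles = [i for i, p in enumerate(probs) if p >= 0.5]
```

Using an independent sigmoid per article, instead of a single softmax over all articles, is what lets this sketch return several articles for one case, which is the multi-label setting the abstract describes.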


          • Published in

            ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 21, Issue 4
            July 2022
            464 pages
            ISSN:2375-4699
            EISSN:2375-4702
            DOI:10.1145/3511099

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 4 April 2022
            • Accepted: 1 November 2021
            • Revised: 1 October 2021
            • Received: 1 May 2021

            Qualifiers

            • short-paper
            • Refereed
