Abstract
Legal judgment prediction (LJP) aims to predict judgment results from the description of an individual legal case. To better fit real application scenarios, in which a single case may cite multiple law articles and involve multiple charges, we formulate legal judgment prediction as a multi-label learning problem. We present a deep learning model that encodes the content of each legal case with a multi-residual convolutional neural network and the semantics of law articles with an article encoder. An article-wise attention mechanism is proposed to fuse the two types of encoded information. Experimental results on the CAIL2018 datasets show that our model provides a significant performance improvement over existing neural models in predicting relevant law articles and charges.
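The fusion step described above can be illustrated with a minimal sketch of label-wise attention for multi-label prediction. This is not the model's exact formulation, which the abstract does not specify. It assumes dot-product scoring, one sigmoid output per article, and a 0.5 decision threshold. Random vectors stand in for the outputs of the case encoder and the article encoder, and every name here (`article_wise_attention`, `case_feats`, `article_embs`, `out_weights`, `out_bias`) is hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def article_wise_attention(case_feats, article_embs, out_weights, out_bias):
    """Fuse case features with article embeddings, one attention row per article.

    case_feats:   n_tokens vectors (assumed output of the case encoder)
    article_embs: n_articles vectors (assumed output of the article encoder)
    out_weights, out_bias: per-article output layer (assumed, for illustration)
    Returns the attention matrix and one independent probability per article.
    """
    dim = len(case_feats[0])
    attn, probs = [], []
    for emb, w, b in zip(article_embs, out_weights, out_bias):
        # Score every case token against this article, then normalise over tokens.
        weights = softmax([dot(emb, tok) for tok in case_feats])
        # Article-specific view of the case: attention-weighted sum of token features.
        view = [sum(a * tok[k] for a, tok in zip(weights, case_feats))
                for k in range(dim)]
        # Sigmoid per article, so several articles can be predicted for one case.
        logit = dot(view, w) + b
        attn.append(weights)
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return attn, probs


# Toy sizes and random inputs, chosen only to make the sketch runnable.
rng = random.Random(0)
n_tokens, n_articles, dim = 6, 3, 4
case_feats = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
article_embs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_articles)]
out_weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_articles)]
out_bias = [0.0] * n_articles

attn, probs = article_wise_attention(case_feats, article_embs, out_weights, out_bias)
predicted = [i for i, p in enumerate(probs) if p >= 0.5]  # assumed threshold
```

Because each article receives its own sigmoid probability rather than competing in a single softmax, a case can be assigned zero, one, or several articles, which is what the multi-label formulation requires.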
Index Terms
- Mulan: A Multiple Residual Article-Wise Attention Network for Legal Judgment Prediction