Few-Shot Charge Prediction with Multi-grained Features and Mutual Information

  • Conference paper
Chinese Computational Linguistics (CCL 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12869)

Abstract

Charge prediction aims to predict the final charge for a case from its fact description and plays an important role in legal assistance systems. Deep learning based methods have achieved promising results on high-frequency charges, but prediction on few-shot charges remains challenging. In this work, we propose a framework with multi-grained features and mutual information for few-shot charge prediction. Specifically, we extract coarse- and fine-grained features to strengthen the model’s representation capability, so that few-shot charges can be better distinguished. Furthermore, we propose a loss function based on mutual information. This loss function leverages the prior distribution of the charges to tune their weights, so that few-shot charges contribute more to model optimization. Experimental results on several datasets demonstrate the effectiveness and robustness of our method. In addition, our method works well on tiny datasets and trains more efficiently, which makes it more applicable in real scenarios.
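
The abstract does not give the exact form of the mutual-information loss. As a rough illustration of how a charge prior can reweight training, the sketch below uses logit adjustment in the spirit of [17]: each logit is shifted by the log prior of its class before the softmax cross-entropy is taken. The function names, the `tau` parameter, and the toy counts are assumptions made for this example and are not taken from the paper.

```python
import math


def log_prior(class_counts):
    """Log of the empirical class prior estimated from training counts."""
    total = sum(class_counts)
    return [math.log(c / total) for c in class_counts]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def plain_cross_entropy(logits, label):
    """Standard softmax cross-entropy for a single example."""
    return log_sum_exp(logits) - logits[label]


def prior_adjusted_cross_entropy(logits, label, class_counts, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior).

    Hypothetical stand-in for a prior-aware loss: rare classes start with
    a strongly negative offset, so the model must raise their raw logits
    further, which enlarges their loss (and gradient) during training.
    """
    shifted = [z + tau * lp for z, lp in zip(logits, log_prior(class_counts))]
    return log_sum_exp(shifted) - shifted[label]


# Toy setting: one frequent charge (990 cases) and one few-shot charge (10 cases).
counts = [990, 10]
logits = [0.0, 0.0]  # an uninformed model

rare_loss = prior_adjusted_cross_entropy(logits, 1, counts)
frequent_loss = prior_adjusted_cross_entropy(logits, 0, counts)
baseline = plain_cross_entropy(logits, 1)
```

In this toy setting the few-shot example incurs a larger loss than under plain cross-entropy, while the frequent example incurs a smaller one, which is the qualitative behavior the abstract describes. At prediction time the offset would be dropped and the raw logits used directly.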

Notes

  1. We suggest that the reader refer to the original paper [20] for the details of the capsule network.

  2. https://github.com/thunlp/THULAC-Python

References

  1. Chen, H., Cai, D., Dai, W., Dai, Z., Ding, Y.: Charge-based prison term prediction with deep gating network. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019, pp. 6361–6366. Association for Computational Linguistics (2019). https://doi.org/10.18653/v1/D19-1667

  2. Gao, T., Han, X., Liu, Z., Sun, M.: Hybrid attention-based prototypical networks for noisy few-shot relation classification. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, 27 January–1 February 2019, pp. 6407–6414. AAAI Press (2019). https://doi.org/10.1609/aaai.v33i01.33016407

  3. Geng, R., Li, B., Li, Y., Ye, Y., Jian, P., Sun, J.: Few-shot text classification with induction network. CoRR abs/1902.10482 (2019). http://arxiv.org/abs/1902.10482

  4. He, C., Peng, L., Le, Y., He, J., Zhu, X.: SEcaps: a sequence enhanced capsule model for charge prediction. In: Tetko, I.V., Kurková, V., Karpov, P., Theis, F.J. (eds.) Artificial Neural Networks and Machine Learning - ICANN 2019: Text and Time Series - 28th International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019, Proceedings, Part IV. LNCS, vol. 11730, pp. 227–239. Springer, Heidelberg (2019). https://doi.org/10.1007/978-3-030-30490-4_19

  5. Hu, Z., Li, X., Tu, C., Liu, Z., Sun, M.: Few-shot charge prediction with discriminative legal attributes. In: Bender, E.M., Derczynski, L., Isabelle, P. (eds.) Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, New Mexico, USA, 20–26 August 2018, pp. 487–498. Association for Computational Linguistics (2018). https://www.aclweb.org/anthology/C18-1041/

  6. Katz, D.M., Bommarito, M.J., Blackman, J.: A general approach for predicting the behavior of the Supreme Court of the United States. PLOS ONE 12(4), e0174698 (2017)

  7. Keown, R.: Mathematical models for legal prediction. Comput./Law J. 2, 829 (1980)

  8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Bengio, Y., LeCun, Y. (eds.) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015, Conference Track Proceedings (2015). http://arxiv.org/abs/1412.6980

  9. Kort, F.: Predicting Supreme Court decisions mathematically: a quantitative analysis of the “right to counsel” cases. Am. Polit. Sci. Rev. 51(1), 1–12 (1957)

  10. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: Bonet, B., Koenig, S. (eds.) Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 25–30 January 2015, Austin, Texas, USA, pp. 2267–2273. AAAI Press (2015). http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745

  11. Lin, T., RoyChowdhury, A., Maji, S.: Bilinear CNN models for fine-grained visual recognition. In: 2015 IEEE International Conference on Computer Vision, ICCV 2015, Santiago, Chile, 7–13 December 2015, pp. 1449–1457. IEEE Computer Society (2015). https://doi.org/10.1109/ICCV.2015.170

  12. Lin, W., Kuo, T., Chang, T., Yen, C., Chen, C., Lin, S.: Exploiting machine learning models for Chinese legal documents labeling, case classification, and sentencing prediction. Int. J. Comput. Linguist. Chin. Lang. Process. 17(4) (2012). http://www.aclclp.org.tw/clclp/v17n4/v17n4a4.pdf

  13. Liu, C., Chang, C., Ho, J.: Case instance generation and refinement for case-based criminal summary judgments in Chinese. J. Inf. Sci. Eng. 20(4), 783–800 (2004). http://www.iis.sinica.edu.tw/page/jise/2004/200407_12.html

  14. Liu, C., Hsieh, C.: Exploring phrase-based classification of judicial documents for criminal charges in Chinese. In: Esposito, F., Ras, Z.W., Malerba, D., Semeraro, G. (eds.) Foundations of Intelligent Systems, 16th International Symposium, ISMIS 2006, Bari, Italy, 27–29 September 2006, Proceedings. LNCS, vol. 4203, pp. 681–690. Springer, Heidelberg (2006). https://doi.org/10.1007/11875604_75

  15. Liu, Z., Tu, C., Liu, Z., Sun, M.: Legal cause prediction with inner descriptions and outer hierarchies. In: Sun, M., Huang, X., Ji, H., Liu, Z., Liu, Y. (eds.) Chinese Computational Linguistics - 18th China National Conference, CCL 2019, Kunming, China, 18–20 October 2019, Proceedings. LNCS, vol. 11856, pp. 573–586. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32381-3_46

  16. Luo, B., Feng, Y., Xu, J., Zhang, X., Zhao, D.: Learning to predict charges for criminal cases with legal basis. In: Palmer, M., Hwa, R., Riedel, S. (eds.) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, 9–11 September 2017, pp. 2727–2736. Association for Computational Linguistics (2017). https://doi.org/10.18653/v1/d17-1289

  17. Menon, A.K., Jayasumana, S., Rawat, A.S., Jain, H., Veit, A., Kumar, S.: Long-tail learning via logit adjustment. CoRR abs/2007.07314 (2020). https://arxiv.org/abs/2007.07314

  18. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Burges, C.J.C., Bottou, L., Ghahramani, Z., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a Meeting Held 5–8 December 2013, Lake Tahoe, Nevada, United States, pp. 3111–3119 (2013). https://proceedings.neurips.cc/paper/2013/hash/9aa42b31882ec039965f3c4923ce901b-Abstract.html

  19. Pan, S., Lu, T., Gu, N., Zhang, H., Xu, C.: Charge prediction for multi-defendant cases with multi-scale attention. In: Sun, Y., Lu, T., Yu, Z., Fan, H., Gao, L. (eds.) Computer Supported Cooperative Work and Social Computing - 14th CCF Conference, ChineseCSCW 2019, Kunming, China, August 16–18, 2019, Revised Selected Papers, Communications in Computer and Information Science, vol. 1042, pp. 766–777. Springer, Singapore (2019). https://doi.org/10.1007/978-981-15-1377-0_59

  20. Sabour, S., Frosst, N., Hinton, G.E.: Dynamic routing between capsules. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 3856–3866 (2017). https://proceedings.neurips.cc/paper/2017/hash/2cad8fa47bbef282badbb8de5374b894-Abstract.html

  21. Sulea, O., Zampieri, M., Malmasi, S., Vela, M., Dinu, L.P., van Genabith, J.: Exploring the use of text classification in the legal domain. In: Ashley, K.D., et al. (eds.) Proceedings of the Second Workshop on Automated Semantic Analysis of Information in Legal Texts Co-located with the 16th International Conference on Artificial Intelligence and Law (ICAIL 2017), London, UK, 16 June 2017. CEUR Workshop Proceedings, vol. 2143. CEUR-WS.org (2017). http://ceur-ws.org/Vol-2143/paper5.pdf

  22. Wang, P., Fan, Y., Niu, S., Yang, Z., Zhang, Y., Guo, J.: Hierarchical matching network for crime classification. In: Piwowarski, B., Chevalier, M., Gaussier, É., Maarek, Y., Nie, J., Scholer, F. (eds.) Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, 21–25 July 2019, pp. 325–334. ACM (2019). https://doi.org/10.1145/3331184.3331223

  23. Xiao, C., et al.: CAIL2018: A large-scale legal dataset for judgment prediction. CoRR abs/1807.02478 (2018). http://arxiv.org/abs/1807.02478

  24. Xu, H., Liu, B., Shu, L., Yu, P.S.: Lifelong domain word embedding via meta-learning. In: Lang, J. (ed.) Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, 13–19 July 2018, Stockholm, Sweden, pp. 4510–4516. ijcai.org (2018). https://doi.org/10.24963/ijcai.2018/627

  25. Xu, H., Liu, B., Shu, L., Yu, P.S.: Open-world learning and application to product classification. In: Liu, L., et al. (eds.) The World Wide Web Conference, WWW 2019, San Francisco, CA, USA, 13–17 May 2019, pp. 3413–3419. ACM (2019). https://doi.org/10.1145/3308558.3313644

  26. Xu, N., Wang, P., Chen, L., Pan, L., Wang, X., Zhao, J.: Distinguish confusing law articles for legal judgment prediction. In: Jurafsky, D., Chai, J., Schluter, N., Tetreault, J.R. (eds.) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, 5–10 July 2020, pp. 3086–3095. Association for Computational Linguistics (2020). https://doi.org/10.18653/v1/2020.acl-main.280

  27. Yang, W., Jia, W., Zhou, X., Luo, Y.: Legal judgment prediction via multi-perspective bi-feedback network. In: Kraus, S. (ed.) Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, 10–16 August 2019, pp. 4085–4091. ijcai.org (2019). https://doi.org/10.24963/ijcai.2019/567

  28. Yu, M., et al.: Diverse few-shot text classification with multiple metrics. In: Walker, M.A., Ji, H., Stent, A. (eds.) Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, 1–6 June 2018, Volume 1 (Long Papers), pp. 1206–1215. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/n18-1109

  29. Zhong, H., Guo, Z., Tu, C., Xiao, C., Liu, Z., Sun, M.: Legal judgment prediction via topological learning. In: Riloff, E., Chiang, D., Hockenmaier, J., Tsujii, J. (eds.) Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018, pp. 3540–3549. Association for Computational Linguistics (2018). https://doi.org/10.18653/v1/d18-1390

Acknowledgements

We thank all the anonymous reviewers for their insightful comments. This work was supported by the National Natural Science Foundation of China (No. 61872370 and No. 61832017) and the Beijing Outstanding Young Scientist Program (No. BJJWZYJH012019100020098).

Author information

Corresponding author

Correspondence to Zhicheng Dou.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, H., Dou, Z., Zhu, Y., Wen, J. (2021). Few-Shot Charge Prediction with Multi-grained Features and Mutual Information. In: Li, S., et al. Chinese Computational Linguistics. CCL 2021. Lecture Notes in Computer Science, vol 12869. Springer, Cham. https://doi.org/10.1007/978-3-030-84186-7_26

  • DOI: https://doi.org/10.1007/978-3-030-84186-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84185-0

  • Online ISBN: 978-3-030-84186-7

  • eBook Packages: Computer Science, Computer Science (R0)
