Abstract
To justify rulings, legal documents must present facts as well as an analysis built on them. In this paper, we present two methods to automatically extract case-relevant facts from French-language legal documents pertaining to tenant-landlord disputes. Our models consist of an ensemble that classifies a given sentence as either Fact or non-Fact, regardless of its context, and a recurrent architecture that determines the class of each sentence in a given document from its context. Both models are combined with a heuristic-based segmentation system that identifies the optimal point in the legal text where the presentation of facts ends and the analysis begins. When tested on a dataset of rulings from the Régie du Logement of the city of ANONYMOUS, the recurrent architecture outperforms the sentence ensemble classifier. The fact segmentation task produces a splitting index that can be weighted to favour either shorter segments with few instances of non-facts or longer segments with higher recall of facts. Our best configuration segments 40% of the dataset within a single sentence of the gold standard. An analysis of the results leads us to believe that the commonly accepted assumption that facts precede the analysis in legal documents often does not hold.
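To make the idea of a weighted splitting index concrete, the following is a minimal sketch under stated assumptions; it is not the heuristic used in the paper, whose details the abstract does not give. It assumes each sentence already carries a binary prediction (1 for Fact, 0 for non-Fact) and picks the prefix whose score is highest, where each fact adds 1 and each non-fact subtracts a penalty. The names `split_index`, `weight`, and `within_offset`, as well as the scoring rule itself, are hypothetical.

```python
from typing import Sequence


def split_index(labels: Sequence[int], weight: float = 1.0) -> int:
    """Return k such that sentences [0, k) form the facts segment.

    labels[i] is 1 if sentence i was predicted as Fact, else 0.
    Assumed scoring rule (illustrative only): every fact kept in the
    segment adds 1, every non-fact kept in the segment costs `weight`.
    """
    best_k, best_score = 0, 0.0
    score = 0.0
    for k, label in enumerate(labels, start=1):
        score += 1.0 if label == 1 else -weight
        # Strict comparison: on ties, the shorter segment is kept.
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def within_offset(predicted: int, gold: int, tolerance: int = 1) -> bool:
    """True if the predicted split lies within `tolerance` sentences of gold."""
    return abs(predicted - gold) <= tolerance


# Hypothetical per-sentence predictions for one ruling.
predictions = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

strict_split = split_index(predictions, weight=3.0)     # short, few non-facts
balanced_split = split_index(predictions, weight=1.0)
lenient_split = split_index(predictions, weight=0.25)   # long, higher recall
```

In this sketch a large `weight` yields a short segment containing almost no non-facts, while a small `weight` extends the segment to capture later facts at the cost of including more non-facts. `within_offset` mirrors the kind of one-sentence tolerance mentioned in the evaluation.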
Supported by the CLaC Lab and the CyberJustice Lab at the University of Montréal.
Notes
1. The source code is publicly available at https://gitlab.com/Feasinde/fact-extraction-from-legal-documents.
Acknowledgment
The authors would like to thank the anonymous reviewers for their comments on an earlier version of this paper.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lou, A., Salaün, O., Westermann, H., Kosseim, L. (2021). Extracting Facts from Case Rulings Through Paragraph Segmentation of Judicial Decisions. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_17
DOI: https://doi.org/10.1007/978-3-030-80599-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80598-2
Online ISBN: 978-3-030-80599-9
eBook Packages: Computer Science, Computer Science (R0)