DOI: 10.1145/3322640.3326736
research-article

Improving Sentence Retrieval from Case Law for Statutory Interpretation

Published: 17 June 2019

Abstract

Statutory texts employ vague terms that are difficult to understand. Here we study and evaluate methods for retrieving useful sentences from court opinions that elaborate on the meaning of a vague statutory term. Retrieving sentences instead of whole cases may spare a user the need to review long lists of cases in search of useful explanations. We assembled a data set of 4,635 sentences that were responses to three statutory queries and labeled them in terms of their usefulness for interpretation. We have run a series of experiments on this data set, which we have made public, assessing different techniques to solve the task. These include techniques that measure the similarity between the sentence and the query, utilize the context of a sentence, expand queries, or assess the novelty of a sentence with respect to the statutory provision from which the interpreted term comes. Based on a detailed error analysis, we propose a specialized sentence retrieval framework that mitigates the challenges of retrieving case law sentences for interpreting statutory terms. The results of evaluating different implementations of the framework are promising (NDCG of .725 at 10 and .662 at 100).
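
The abstract reports results as NDCG at rank cutoffs 10 and 100. As a rough illustration of how such a score can be computed, the sketch below implements one common NDCG formulation (linear gain, log2 position discount). The abstract does not say which variant the authors used, so the formulation, the function names `dcg_at_k` and `ndcg_at_k`, and the 0 to 3 usefulness scale are all assumptions made for this example only.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain, log2 discount)."""
    # rank is 0-based, so the first position is discounted by log2(2) = 1.
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ordering.

    Assumes ranked_gains holds the label of every judged sentence for the query,
    so the ideal ordering can be obtained by sorting the same list.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical usefulness labels (0 = not useful ... 3 = highly useful),
# listed in the order a retrieval system returned the sentences.
ranking = [3, 0, 2, 1, 0, 3]
print(f"NDCG@3 = {ndcg_at_k(ranking, 3):.3f}")
```

A score of 1.0 means the top-k sentences appear in the best possible order with respect to the labels. Lower values indicate that useful sentences were ranked below less useful ones.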

      Published In

      ICAIL '19: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law
      June 2019
      312 pages
      ISBN:9781450367547
      DOI:10.1145/3322640
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Sponsors

      In-Cooperation

      • Univ. of Montreal: University of Montreal
      • AAAI
      • IAAIL: International Association for Artificial Intelligence and Law

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Information retrieval
      2. case-law analysis
      3. relevant sentences
      4. statutory interpretation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICAIL '19
      Acceptance Rates

      Overall Acceptance Rate 69 of 169 submissions, 41%

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 28
      • Downloads (Last 6 weeks): 3
      Reflects downloads up to 08 Mar 2025

      Cited By

      • (2024) Text mining and machine learning for crime classification: using unstructured narrative court documents in police academic. Cogent Engineering 11:1. DOI: 10.1080/23311916.2024.2359850. Online publication date: 3-Jun-2024
      • (2024) I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets. Artificial Intelligence and Law 32:3 (839-862). DOI: 10.1007/s10506-023-09369-4. Online publication date: 1-Sep-2024
      • (2023) An Intent Taxonomy of Legal Case Retrieval. ACM Transactions on Information Systems 42:2 (1-27). DOI: 10.1145/3626093. Online publication date: 29-Sep-2023
      • (2023) Identification of Legislative Errors. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law (2-11). DOI: 10.1145/3594536.3595172. Online publication date: 19-Jun-2023
      • (2023) Semantic matching based legal information retrieval system for COVID-19 pandemic. Artificial Intelligence and Law 32:2 (397-426). DOI: 10.1007/s10506-023-09354-x. Online publication date: 14-Mar-2023
      • (2022) Exploring Narrative Court Documents for Use in Police Academic Education. 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN) (41-45). DOI: 10.1109/CICN56167.2022.10008327. Online publication date: 4-Dec-2022
      • (2021) Casefinder: A Non-Law Students Smartphone App for Legal Writing. 2021 3rd International Conference on Modern Educational Technology (13-19). DOI: 10.1145/3468978.3468981. Online publication date: 21-May-2021
      • (2021) Legal information retrieval for understanding statutory terms. Artificial Intelligence and Law 30:2 (245-289). DOI: 10.1007/s10506-021-09293-5. Online publication date: 8-Jul-2021
      • (2021) Towards Grad-CAM Based Explainability in a Legal Text Processing Pipeline. Extended Version. AI Approaches to the Complexity of Legal Systems XI-XII (154-168). DOI: 10.1007/978-3-030-89811-3_11. Online publication date: 27-Nov-2021
      • (2021) On the Role of Past Treatment of Terms from Written Laws in Legal Reasoning. New Developments in Legal Reasoning and Logic (379-395). DOI: 10.1007/978-3-030-70084-3_15. Online publication date: 17-Dec-2021
