Improving Sentence Retrieval from Case Law for Statutory Interpretation

Authors:

Jaromir Savelka,

Kevin D. Ashley

ICAIL '19: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law

Pages 113 - 122

https://doi.org/10.1145/3322640.3326736

Published: 17 June 2019

Abstract

Statutory texts employ vague terms that are difficult to understand. Here we study and evaluate methods for retrieving useful sentences from court opinions that elaborate on the meaning of a vague statutory term. Retrieving sentences instead of whole cases may spare a user the need to review long lists of cases in search of useful explanations. We assembled a data set of 4,635 sentences that were responses to three statutory queries and labeled them in terms of their usefulness for interpretation. We have run a series of experiments on this data set, which we have made public, assessing different techniques to solve the task. These include techniques that measure the similarity between the sentence and the query, utilize the context of a sentence, expand queries, or assess the novelty of a sentence with respect to the statutory provision from which the interpreted term comes. Based on a detailed error analysis, we propose a specialized sentence retrieval framework that mitigates the challenges of retrieving case law sentences for interpreting statutory terms. The results of evaluating different implementations of the framework are promising (NDCG of .725 at 10 and .662 at 100).
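
The abstract reports results as NDCG at rank cutoffs of 10 and 100. As a minimal sketch of how such a score can be computed, the snippet below uses the common formulation with linear gains and a log2 rank discount. The paper's exact NDCG variant is not stated in the abstract, so the formulation, the function names (dcg_at_k, ndcg_at_k), and the graded labels in the example are illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items of a ranked list."""
    # Assumed formulation: linear gain, log2(rank + 1) discount, ranks start at 1.
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int) -> float:
    """DCG@k normalized by the DCG@k of the ideal (descending-gain) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical usefulness labels (higher = more useful) for retrieved sentences,
# listed in the order a retrieval method ranked them.
ranked_gains = [3, 0, 2, 1, 0]
print(round(ndcg_at_k(ranked_gains, k=3), 3))
```

A score of 1.0 means the top k sentences appear in the best possible order given the labels. Lower values indicate that useful sentences were ranked below less useful ones.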

Index Terms

Improving Sentence Retrieval from Case Law for Statutory Interpretation
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Rank aggregation
      2. Similarity measures

Published In

ICAIL '19: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law

June 2019

312 pages

ISBN: 9781450367547

DOI: 10.1145/3322640

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAI: ACM Special Interest Group on Artificial Intelligence

In-Cooperation

Univ. of Montreal: University of Montreal
AAAI
IAAIL: International Association for Artificial Intelligence and Law

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICAIL '19

Sponsor:

SIGAI

ICAIL '19: Seventeenth International Conference on Artificial Intelligence and Law

June 17 - 21, 2019

Montreal, QC, Canada

Acceptance Rates

Overall Acceptance Rate 69 of 169 submissions, 41%

Bibliometrics

Total Citations: 10
Total Downloads: 367
Downloads (Last 12 months): 28
Downloads (Last 6 weeks): 3

Reflects downloads up to 08 Mar 2025

Cited By

Bifari E, Basbrain A, Mirza R, Bafail A, Albaradei S, Alhalabi W. (2024). Text mining and machine learning for crime classification: using unstructured narrative court documents in police academic. Cogent Engineering 11:1. Online publication date: 3-Jun-2024.
https://doi.org/10.1080/23311916.2024.2359850
Braun D. (2024). I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets. Artificial Intelligence and Law 32:3 (839-862). Online publication date: 1-Sep-2024.
https://dl.acm.org/doi/10.1007/s10506-023-09369-4
Shao Y, Li H, Wu Y, Liu Y, Ai Q, Mao J, Ma Y, Ma S. (2023). An Intent Taxonomy of Legal Case Retrieval. ACM Transactions on Information Systems 42:2 (1-27). Online publication date: 29-Sep-2023.
https://dl.acm.org/doi/10.1145/3626093
Araszkiewicz M, Francesconi E, Zurek T, Andrade F, Grabmair M. (2023). Identification of Legislative Errors. Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law (2-11). Online publication date: 19-Jun-2023.
https://dl.acm.org/doi/10.1145/3594536.3595172
Zhu J, Wu J, Luo X, Liu J. (2023). Semantic matching based legal information retrieval system for COVID-19 pandemic. Artificial Intelligence and Law 32:2 (397-426). Online publication date: 14-Mar-2023.
https://doi.org/10.1007/s10506-023-09354-x
Bifari E, Alhalabi W. (2022). Exploring Narrative Court Documents for Use in Police Academic Education. 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN) (41-45). Online publication date: 4-Dec-2022.
https://doi.org/10.1109/CICN56167.2022.10008327
Saian R, Mohd Zakuan Z. (2021). Casefinder: A Non-Law Students Smartphone App for Legal Writing. 2021 3rd International Conference on Modern Educational Technology (13-19). Online publication date: 21-May-2021.
https://dl.acm.org/doi/10.1145/3468978.3468981
Šavelka J, Ashley K. (2021). Legal information retrieval for understanding statutory terms. Artificial Intelligence and Law 30:2 (245-289). Online publication date: 8-Jul-2021.
https://doi.org/10.1007/s10506-021-09293-5
Górski Ł, Ramakrishna S, Nowosielski J. (2021). Towards Grad-CAM Based Explainability in a Legal Text Processing Pipeline. Extended Version. AI Approaches to the Complexity of Legal Systems XI-XII (154-168). Online publication date: 27-Nov-2021.
https://doi.org/10.1007/978-3-030-89811-3_11
Savelka J, Ashley K. (2021). On the Role of Past Treatment of Terms from Written Laws in Legal Reasoning. New Developments in Legal Reasoning and Logic (379-395). Online publication date: 17-Dec-2021.
https://doi.org/10.1007/978-3-030-70084-3_15
