Abstract
Legal decision-support systems have the potential to improve access to justice, administrative efficiency, and judicial consistency, but broad adoption of such systems is contingent on development of technologies with low knowledge-engineering, validation, and maintenance costs. This paper describes two approaches to an important form of legal decision support—explainable outcome prediction—that obviate both annotation of an entire decision corpus and manual processing of new cases. The first approach, which uses an attention network for prediction and attention weights to highlight salient case text, was shown to be capable of predicting decisions, but attention-weight-based text highlighting did not demonstrably improve human decision speed or accuracy in an evaluation with 61 human subjects. The second approach, termed semi-supervised case annotation for legal explanations, exploits structural and semantic regularities in case corpora to identify textual patterns that have both predictable relationships to case decisions and explanatory value.
Notes
We note that models for legal prediction, as with other inductive models in dynamic domains, can be subject to concept drift (Medvedeva et al. 2020).
See The EXplainable AI in Law (XAILA) (2018) for a recent exception to this generalization.
Cases in which decisions consist of numerical awards can be modeled as regression problems. For simplicity, we confine the discussion in this paper to categorical classification.
Mean.
SE.
At the time of writing, we have not yet completed annotation of each individual issue for every instance in our data set. The experiments described below therefore involve prediction only of the overall outcome of each case, without individual issue decisions.
Decision sections were annotated as well. However, since Decision sections consisted only of brief conclusory text, this portion was useful for obtaining the decision label—transferred or not transferred—but not for explanation purposes, and was therefore not used in tag projection.
The annotated corpus is available to researchers at https://github.com/johnaberdeen/Scalable-and-Explainable-Legal-Prediction.
In tenfold cross-validation, we observed a mean F-measure of 0.971 and MCC of 0.815 for transfer prediction using an SVM (Platt 1999) applied to the 1133 highest-information n-grams \((n=1{-}5)\) occurring in the stop-word-filtered text of the Findings sections of the full corpus.
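The n-gram SVM baseline described in this note can be sketched with a standard scikit-learn pipeline. The texts, labels, and feature-selection count below are toy stand-ins (the paper used the 1133 highest-information n-grams over the full corpus), and the specific pipeline components are our assumptions rather than the authors' exact configuration:

```python
# Minimal sketch of the n-gram SVM transfer-prediction baseline.
# Texts, labels, and k are hypothetical stand-ins for the WIPO corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

texts = [
    "respondent registered the domain name in bad faith",
    "complainant owns the trademark and the domain name is confusingly similar",
    "respondent has legitimate interests in the domain name",
    "no evidence of bad faith registration or use was presented",
]
labels = [1, 1, 0, 0]  # 1 = transferred, 0 = not transferred

pipeline = Pipeline([
    # Word n-grams of length 1-5 over stop-word-filtered text.
    ("ngrams", TfidfVectorizer(ngram_range=(1, 5), stop_words="english")),
    # Retain the highest-information n-grams (k=20 for this toy corpus).
    ("select", SelectKBest(mutual_info_classif, k=20)),
    # Linear SVM; the paper cites Platt's SMO training procedure.
    ("svm", SVC(kernel="linear")),
])
pipeline.fit(texts, labels)
preds = pipeline.predict(texts)
```

The tenfold protocol reported in the note could then be approximated with `sklearn.model_selection.cross_val_score`, scoring with F-measure and a Matthews-correlation scorer.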
We used a 300-dimension word embedding based on 55,975,964 words and a skipgram model, which we found outperformed cbow for our task.
See 15(e) of the Rules for Uniform Domain Name Dispute Resolution Policy for CIBF, https://www.icann.org/resources/pages/udrp-rules-2015-03-11-en.
References
Al-Abdulkarim L, Atkinson K, Bench-Capon TJM, Whittle S, Williams R, Wolfenden C (2017) Noise induced hearing loss: an application of the angelic methodology. In: Legal knowledge and information systems—JURIX 2017: the thirtieth annual conference, Luxembourg, 13–15 December 2017, pp 79–88
Alarie B, Niblett A, Yoon A (2017) Using machine learning to predict outcomes in tax law. Available at SSRN https://ssrn.com/abstract=2855977 or https://doi.org/10.2139/ssrn.2855977
Aletras N, Tsarapatsanis D, Preotiuc-Pietro D, Lampos V (2016) Predicting judicial decisions of the European Court of Human Rights: a natural language processing perspective. PeerJ Comput Sci. https://peerj.com/articles/cs-93/
Aleven VAWMM (1997) Teaching case-based argumentation through a model and examples. PhD thesis, University of Pittsburgh, Pittsburgh. AAI9821228
Aleven V, Ashley K (1996) Doing things with factors. In: Proceedings of the 3rd European workshop on case-based reasoning (EWCR-96), Lausanne, pp 76–90
Ashley KD (2017) Artificial intelligence and legal analytics: new tools for law practice in the digital age. Cambridge University Press, Cambridge
Ashley KD, Aleven V (1997) Reasoning symbolically about partially matched cases. In: Proceedings of the 15th international joint conference on artificial intelligence. Morgan Kaufmann, San Francisco, pp 335–341
Ashley KD, Brüninghaus S (2009) Automatically classifying case texts and predicting outcomes. Artif Intell Law 17(2):125–165
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. Preprint arXiv:1409.0473
Bench-Capon TJM, Dunne PE (2007) Argumentation in artificial intelligence. Artif Intell 171(10–15):619–641
Berger AL, Pietra VJD, Pietra SAD (1996) A maximum entropy approach to natural language processing. Comput Linguist 22(1):39–71
Bojanowski P, Grave E, Joulin A, Mikolov T (2016) Enriching word vectors with subword information. CoRR arXiv:abs/1607.04606
Boles DB, Adair LP (2001) The multiple resources questionnaire (MRQ). Proc Hum Factors Ergon Soc Annu Meet 45(25):1790–1794
Bouckaert RR (2005) Bayesian network classifiers in Weka. https://www.cs.waikato.ac.nz/~remco/weka.bn.pdf. Accessed 23 June 2020
Boughorbel S, Jarray F, El-Anbari M (2017) Optimal classifier for imbalanced data using Matthews correlation coefficient metric. PLoS ONE 12(6):E0177678
Branting LK (2000a) An advisory system for pro se protection order applicants. Int Rev Law Comput Technol 14(3):357–369
Branting LK (2000b) Reasoning with rules and precedents: a computational model of legal analysis. Kluwer, Dordrecht
Branting LK, Yeh A, Weiss B, Merkhofer EM, Brown B (2017) Inducing predictive models for decision support in administrative adjudication. In: AI approaches to the complexity of legal systems—AICOL international workshops 2015–2017, revised selected papers. Lecture notes in computer science, vol. 10791. Springer, Berlin, pp 465–477
Brooke J (1996) SUS—a quick and dirty usability scale. Usab Eval Ind 189(194):4–7
Brüninghaus S, Ashley KD (1999) Toward adding knowledge to learning algorithms for indexing legal cases. In: Proceedings of the 7th international conference on artificial intelligence and law, ICAIL’99. ACM, New York, pp 9–17. https://doi.org/10.1145/323706.323709
Bruninghaus S, Ashley KD (2003) Predicting outcomes of case based legal arguments. In: Proceedings of the 9th international conference on artificial intelligence and law, ICAIL’03. ACM, New York, pp 233–242
Chalkidis I, Androutsopoulos I, Aletras N (2019) Neural legal judgment prediction in English. CoRR arXiv:1906.02059
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD’16. ACM, New York, pp 785–794
Dunn PH (2003) How judges overrule: speech act theory and the doctrine of stare decisis. Yale Law J 113(2):493–532
Ferro L, Aberdeen J, Branting K, Pfeifer C, Yeh A, Chakraborty A (2019) Scalable methods for annotating legal-decision corpora. In: Proceedings of the natural legal language processing workshop 2019. Association for Computational Linguistics, Minneapolis, pp 12–20
Gunning D (2018) Defense advanced research projects agency (DARPA) program information: explainable artificial intelligence (XAI). https://www.darpa.mil/program/explainable-artificial-intelligence. Accessed 26 Dec 2018
Hadfield G (2016) Rules for a flat world: why humans invented law and how to reinvent it for a complex global economy. Oxford University Press, Oxford
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11(1):10–18
Henschen B (2018) Judging in a mismatch: the ethical challenges of pro se litigation. Public Integr 20(1):34–46
Herrera F, Charte F, Rivera A, del Jesus M (2016) Multilabel classification: problem analysis, metrics and techniques. Springer, Berlin
Hill F, Cho K, Korhonen A (2016) Learning distributed representations of sentences from unlabelled data. In: 2016 conference of the North American chapter of the association for computational linguistics, pp 1367–1377. Association for Computational Linguistics (ACL). 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016; Conference date: 12-06-2016 Through 17-06-2016
Katz DM, Bommarito MJ II, Blackman J (2017) A general approach for predicting the behavior of the Supreme Court of the United States. PLoS ONE 12(4):e0174698
Lauritsen M, Steenhuis Q (2019) Substantive legal software quality: a gathering storm? In: Proceedings of the 17th international conference on artificial intelligence and law, ICAIL’19. ACM, New York, pp 52–62
Lawrence J, Reed C (2020) Argument mining: a survey. Comput Linguist 45(4):765–818
Lippi M, Torroni P (2016) Argumentation mining: state of the art and emerging trends. ACM Trans Internet Technol 16(2):10:1–10:25
Maxwell KT, Oberlander J, Lavrenko V (2009) Evaluation of semantic events for legal case retrieval. In: Proceedings of the WSDM’09 workshop on exploiting semantic annotations in information retrieval, ESAIR’09. ACM, New York, pp 39–41. https://doi.org/10.1145/1506250.1506259
McCarty LT (2018) Finding the right balance in artificial intelligence and law. In: Research handbook on the law of artificial intelligence. Edward Elgar Publishing, Cheltenham
Medvedeva M, Vols M, Wieling M (2020) Using machine learning to predict decisions of the European court of human rights. Artif Intell Law 28(2):237–266
Mikolov T, Yih SWT, Zweig G (2013) Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 conference of the North American chapter of the association for computational linguistics: human language technologies (NAACL-HLT-2013). Association for Computational Linguistics, New York
Peterson M, Waterman D (1985) Rule-based models of legal expertise. In: Walters C (ed) Computing power and legal reasoning. West Publishing Company, Minneapolis, pp 627–659
Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods. MIT Press, Cambridge, pp 185–208
Ren Y, Fei H, Peng Q (2018) Detecting the scope of negation and speculation in biomedical texts by using recursive neural network. In: 2018 IEEE international conference on bioinformatics and biomedicine (BIBM), pp 739–742
Rissland EL, Skalak DB (1989) Combining case-based and rule-based reasoning: a heuristic approach. In: 11th international joint conference on artificial intelligence, Detroit, pp 524–530
Rissland EL, Ashley KD, Branting LK (2005) Case-based reasoning and law. Knowl Eng Rev 20(3):293–298. https://doi.org/10.1017/S0269888906000701
Rush AM, Chopra S, Weston J (2015) A neural attention model for abstractive sentence summarization. CoRR arXiv:1509.00685
Sauro J, Dumas JS (2009) Comparison of three one-question, post-task usability questionnaires. In: Proceedings of the SIGCHI conference on human factors in computing systems, CHI’09, pp 1599–1608
Sergot MJ, Sadri F, Kowalski RA, Kriwaczek F, Hammond P, Cory HT (1986) The British Nationality Act as a logic program. Commun ACM 29(5):370–386. https://doi.org/10.1145/5689.5920
Shulayeva O, Siddharthan A, Wyner A (2017) Recognizing cited facts and principles in legal judgements. Artif Intell Law 25(1):107–126
Sulea O, Zampieri M, Vela M, van Genabith J (2017) Predicting the law area and decisions of French supreme court cases. In: RANLP. INCOMA Ltd, pp 716–722
Surdeanu M, Nallapati R, Gregory G, Walker J, Manning C (2011) Risk analysis for intellectual property litigation. In: Proceedings of the 13th international conference on artificial intelligence and law. ACM, Pittsburgh
The EXplainable AI in Law (XAILA) (2018) 2018 workshop, Groningen. http://xaila.geist.re
Westermann H, Walker VR, Ashley KD, Benyekhlef K (2019) Using factors to predict and analyze landlord-tenant decisions to increase access to justice. In: Proceedings of the 17th international conference on artificial intelligence and law, ICAIL’19. Association for Computing Machinery, New York, pp 133–142
Wyner AZ, Peters W (2010) Lexical semantics and expert legal knowledge towards the identification of legal case factors. In: JURIX, frontiers in artificial intelligence and applications, vol 223. IOS Press, New York, pp 127–136
Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. In: Proceedings of NAACL-HLT, pp 1480–1489
Yu H, Hatzivassiloglou V (2003) Towards answering opinion questions: separating facts from opinions and identifying the polarity of opinion sentences. In: Proceedings of the 2003 conference on empirical methods in natural language processing, EMNLP’03. Association for Computational Linguistics, Stroudsburg, pp 129–136
Acknowledgements
The MITRE Corporation is a not-for-profit company, chartered in the public interest. This document is approved for Public Release; Distribution Unlimited. Case No. 19-3739. \(\copyright\) 2019 The MITRE Corporation. All rights reserved.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Annotation
A key goal of SCALE is a methodology that permits development of explainable legal prediction systems by agencies that lack the resources to engineer domain-specific feature sets, a process that requires both extensive expertise in the particular legal domain and experience in feature engineering. Instead, SCALE requires only the linguistic skills necessary to annotate the decision portion of a representative subset of cases, a much more limited process.
Our annotation schema for WIPO decisions consists of three layers: Argument Elements, Issues, and Factors (sub-issues).
Tags are applied to clauses and sentences, as opposed to shorter units such as noun phrases, in order to identify the complete linguistic proposition corresponding to the annotation label. The MITRE Annotation Toolkit (MAT)Footnote 13 was used to perform the annotation.
1.1 Argument elements
Although our approach to predictive-text identification is to leverage the Factual Findings and Legal Findings, the annotation schema is designed to capture the full range of argument elements present in cases. These argument elements are as follows:
1. Policy
2. Contention
3. Factual Finding
4. Legal Finding
5. Case Rule
6. Decision
We have found that with these six argument elements, the majority of sentences within the “Findings” and “Decision” sections of WIPO cases can be assigned an argument element label. These argument elements are not specific to WIPO decisions and should be applicable in other domains.
1.2 Issues
Each Argument Element tag is assigned an Issue. The Issue tags include the three required elements that the complainant must establish in order to prevail in a WIPO case. These issues, which are documented in the Uniform Domain Name Dispute Resolution Policy, paragraph 4,Footnote 14 form the backbone of every decision:
- (1) ICS: Domain name is Identical or Confusingly Similar to a trademark or service mark in which the complainant has rights
- (2) NRLI: Respondent has No Rights or Legitimate Interests in respect of the domain name
- (3) Bad Faith: Domain name has been registered and is being used in Bad Faith
For element (2), NRLI, although the dispute is typically approached from the point of view of the complainant demonstrating that the respondent has NRLI, the panel very often considers the rights or legitimate interests of the complainant and/or the respondent; for such cases, RLI is available as an Issue tag. In addition, the domain name resolution procedure allows for situations in which the complainant abuses the process by filing the complaint in bad faith (CIBF).Footnote 15
The schema thus consists of five Issue tags, plus an Other category:
- ICS
- NRLI
- RLI
- BadFaith
- CIBF
- OTHER
1.3 Factors
In our annotation scheme, factors are the elements which we hypothesize will prove most useful for explainable legal prediction. The factors and corresponding tags are specific to the WIPO issues. For ICS, the ICANN policy does not explicitly identify specific factors that will be considered by the panel, so our tag set for ICS is derived from factual findings commonly observed in the data, such as CownsTM (Complainant owns Trademark) and TMentire (Trademark is contained in its entirety within the Domain Name). For NRLI/RLI, the policy establishes three factors, and for Bad Faith, four factors. Each of these has a corresponding tag. For example, under NRLI there is PriorBizUse from 4(c)(i) of the policy (“Bona fide business use of Domain Name or demonstrable preparations to do so, prior to notice of the dispute”) and under BadFaith there is Confusion4CommGain from 4(b)(iv) of the policy (“For commercial gain from confusion with complainant’s mark”). The tag set also includes labels for other common factors observed in the data, such as PrimaFacieEst (Prima Facie Case Established). For CIBF, two factor tags are available: RDNH (Reverse Domain Name Hijacking) and Harass (complaint brought primarily to harass DN holder).
Each level of annotation also has an “Other” option to be used when none of the predefined tags is appropriate, and there is a free-form Comment field which the annotator can use to capture ad hoc labels and enter notes.
1.4 Attributes
A Citation attribute is used to capture the paragraph citation of Policy and Case Rule argument elements. A polarity attribute is used to capture positive/negative values for issues and factors. Figure 5 shows four typical annotations.
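The three-layer schema and its attributes can be represented programmatically; the following is a minimal sketch in which the class, field names, and factor-to-issue groupings are our illustrative choices (and the factor sets are deliberately non-exhaustive), not MAT's internal format:

```python
# Illustrative representation of the three-layer WIPO annotation schema
# (Argument Elements, Issues, Factors) with the Citation and polarity
# attributes described above. All names here are hypothetical.
from dataclasses import dataclass
from typing import Optional

ARGUMENT_ELEMENTS = {"Policy", "Contention", "FactualFinding",
                     "LegalFinding", "CaseRule", "Decision"}
ISSUES = {"ICS", "NRLI", "RLI", "BadFaith", "CIBF", "OTHER"}

# Factor tags mentioned in the text, keyed by issue (not exhaustive).
FACTORS = {
    "ICS": {"CownsTM", "TMentire"},
    "NRLI": {"PriorBizUse", "PrimaFacieEst"},
    "BadFaith": {"Confusion4CommGain"},
    "CIBF": {"RDNH", "Harass"},
}

@dataclass
class Annotation:
    text_span: str                  # clause or sentence being tagged
    element: str                    # one of ARGUMENT_ELEMENTS
    issue: str                      # one of ISSUES
    factor: Optional[str] = None    # issue-specific factor tag, if any
    polarity: bool = True           # positive/negative finding
    citation: Optional[str] = None  # Policy/CaseRule paragraph citation
    comment: Optional[str] = None   # free-form annotator notes

    def validate(self) -> bool:
        """Check tag membership and that any factor belongs to the issue."""
        if self.element not in ARGUMENT_ELEMENTS or self.issue not in ISSUES:
            return False
        return self.factor is None or self.factor in FACTORS.get(self.issue, set())

ann = Annotation(
    text_span="The disputed domain name incorporates the mark in its entirety.",
    element="FactualFinding",
    issue="ICS",
    factor="TMentire",
)
```

A validator of this kind makes the layered constraint explicit: an Issue is attached to every Argument Element, while Factors are only meaningful relative to their parent Issue.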
Branting, L.K., Pfeifer, C., Brown, B. et al. Scalable and explainable legal prediction. Artif Intell Law 29, 213–238 (2021). https://doi.org/10.1007/s10506-020-09273-1