ABSTRACT
This paper addresses the automatic extraction of statute facets from legal statutes such as Act documents. We define statute facets as the key, specific aspects of a statute that can potentially be used in legal arguments. For example, Section 25F of the Industrial Disputes Act (India) contains statute facets such as workman, employer, retrenchment of workmen, and continuous service for not less than one year. Lawyers often use such facets in their argumentation, and judges use them when deciding a case. We propose a weakly supervised technique for extracting statute facets from legal text. We use dependency tree structure to extract candidate statute facets and the BM25 ranking function to determine the statute-specificity of these candidates. We propose a set of facet types that lets us realize the definition of statute facets in a more computational way, and we use recent deep learning models in a few-shot setting to predict an appropriate facet type for each candidate. Only candidates that have high statute-specificity and for which a facet type is predicted with high confidence are selected as acceptable statute facets. We evaluate the extracted statute facets through both direct and indirect evaluation, and we conduct a user study to obtain validation and feedback from lawyers.
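To make the selection rule concrete, here is a minimal Python sketch of how a BM25 score could serve as a statute-specificity signal and be combined with a facet-type confidence. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how BM25 queries and documents are formed, so the sketch assumes each candidate is the query and each statute section is a document. The function names, the thresholds, the toy corpus, and the confidence values (which stand in for the output of a few-shot classifier) are all hypothetical.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_tokens, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document for a list of query terms.

    corpus is a list of tokenized documents used for IDF and average length.
    k1 and b are common default values, not values taken from the paper.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        freq = tf[term]
        length_norm = 1 - b + b * len(doc_tokens) / avg_len
        score += idf * freq * (k1 + 1) / (freq + k1 * length_norm)
    return score


def select_facets(candidates, statute_tokens, corpus, type_confidence,
                  min_score, min_conf):
    """Keep candidates that pass both thresholds (assumed selection rule)."""
    selected = []
    for cand in candidates:
        score = bm25_score(cand.lower().split(), statute_tokens, corpus)
        conf = type_confidence.get(cand, 0.0)
        if score >= min_score and conf >= min_conf:
            selected.append((cand, score, conf))
    return selected


# Toy corpus: each entry stands in for one tokenized statute section.
corpus = [
    "retrenchment of workmen requires notice by the employer".split(),
    "the employer shall pay wages".split(),
    "the court shall hear the appeal".split(),
]

# Hypothetical facet-type confidences, standing in for a few-shot classifier.
type_confidence = {"retrenchment of workmen": 0.9, "the employer": 0.4}

facets = select_facets(
    ["retrenchment of workmen", "the employer"],
    corpus[0], corpus, type_confidence,
    min_score=1.0, min_conf=0.8,
)
```

In this toy setup, terms that occur in only one section receive a higher IDF than terms that occur in every section, which is the property that makes BM25 a plausible proxy for statute-specificity.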
Index Terms
- Extraction and Classification of Statute Facets using Few-shot Learning