Abstract
Retrieval-Augmented Generation (RAG) enhances Large Language Model (LLM) output by supplying prior knowledge as context to the input. This is beneficial for knowledge-intensive and expert-reliant tasks, including legal question-answering, which require evidence to validate generated text outputs. We highlight that Case-Based Reasoning (CBR) presents key opportunities to structure retrieval as part of the RAG process in an LLM. We introduce CBR-RAG, where the CBR cycle's initial retrieval stage, its indexing vocabulary, and its similarity knowledge containers are used to enhance LLM queries with contextually relevant cases. This integration augments the original LLM query, providing a richer prompt. We present an evaluation of CBR-RAG, examining different representations (i.e. general and domain-specific embeddings) and methods of comparison (i.e. inter-, intra-, and hybrid similarity) on the task of legal question-answering. Our results indicate that the context provided by CBR's case reuse enforces similarity between relevant components of the questions and the evidence base, leading to significant improvements in the quality of generated answers.
This research is funded by SFC International Science Partnerships Fund.
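The retrieval-and-augment idea summarised in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the equal intra/inter weighting, and the toy two-dimensional "embeddings" (standing in for BERT-style vectors) are all assumptions made for the sake of the example.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_score(q_emb, case, w_intra=0.5, w_inter=0.5):
    # Intra-similarity: query question vs. case question.
    # Inter-similarity: query question vs. case answer/evidence.
    # The 0.5/0.5 weighting is an illustrative assumption.
    intra = cosine(q_emb, case["question_emb"])
    inter = cosine(q_emb, case["answer_emb"])
    return w_intra * intra + w_inter * inter

def retrieve(query, casebase, k=1):
    """Return the k most similar cases under the hybrid score."""
    ranked = sorted(casebase,
                    key=lambda c: hybrid_score(query["question_emb"], c),
                    reverse=True)
    return ranked[:k]

def build_prompt(query_text, cases):
    """Augment the user question with retrieved question-answer pairs."""
    evidence = "\n".join(f"- Q: {c['question']}\n  A: {c['answer']}"
                         for c in cases)
    return (f"Use the following precedent cases as context:\n{evidence}\n\n"
            f"Question: {query_text}")

# Toy case base; in practice each case holds embeddings of its question
# and of its answer/evidence text.
casebase = [
    {"question": "q1", "answer": "a1",
     "question_emb": np.array([1.0, 0.0]), "answer_emb": np.array([1.0, 0.0])},
    {"question": "q2", "answer": "a2",
     "question_emb": np.array([0.0, 1.0]), "answer_emb": np.array([0.0, 1.0])},
]
```

A query whose embedding lies close to a case's question vector retrieves that case, and the case's question-answer pair is prepended to the LLM prompt as evidence.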
Notes
- 1. Reproducible code is available: https://github.com/rgu-iit-bt/cbr-for-legal-rag.
- 2.
- 3.
- 4. Our notation uses calligraphic font for the prompt components (\(f(\mathcal {Q}), g(\mathcal {Q}), \mathcal {C}\)) to distinguish them from those of cases. Despite the stylistic difference, both prompts and cases employ similar embedding representations.
- 5. Test dataset available at open-australian-legal-qa-test.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wiratunga, N. et al. (2024). CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering. In: Recio-Garcia, J.A., Orozco-del-Castillo, M.G., Bridge, D. (eds) Case-Based Reasoning Research and Development. ICCBR 2024. Lecture Notes in Computer Science, vol. 14775. Springer, Cham. https://doi.org/10.1007/978-3-031-63646-2_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63645-5
Online ISBN: 978-3-031-63646-2