CASRank: A ranking algorithm for legal statute retrieval

Multimedia Tools and Applications

Abstract

Unlike those of courts in Western countries, legal documents of the Indian judiciary are unstructured, verbose, and noisy. In the justice system, statutes are written laws that judges cite in support of judicial decisions. Retrieving the statutes relevant to a given legal problem can help lawyers and laypeople alike. Moreover, the dearth of publicly available annotated datasets of Indian legal documents limits the scope of legal analytics research. In this paper, we propose a ranking algorithm called CASRank to identify relevant statutes for a legal case query. We also develop a new dataset consisting of 858 Central Acts enacted by the Indian Parliament. Each Central Act is annotated with several attributes, such as the act title, enactment date, act definition, chapters, sections, schedules, and footnotes. The first set of experiments determines the retrieval model best suited to CASRank. The second set of experiments identifies the extent to which the attributes of the proposed Central Act dataset contribute to the retrieval effectiveness of statutes. Experimental results show that the proposed approach obtains a MAP score of 0.0776 with a Precision@10 of 0.0420, showing a considerable increase in retrieval effectiveness.
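
The abstract reports results as MAP and Precision@10. As a reading aid, the sketch below shows how these two measures are conventionally computed from a ranked list of statutes and a set of relevance judgments. It is not the authors' evaluation code (the notes list trec_eval, which computes these measures); the function names, the statute identifiers, and the toy data are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k positions that hold a relevant statute."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant statute appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Dividing by all relevant items penalises relevant statutes never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Average of the per-query average precision over all queries in the run."""
    if not runs:
        return 0.0
    total = sum(average_precision(ranked, qrels.get(query_id, set()))
                for query_id, ranked in runs.items())
    return total / len(runs)


# Hypothetical toy data: two queries, statute ids S1..S6.
runs = {"q1": ["S1", "S2", "S3", "S4"], "q2": ["S5", "S6"]}
qrels = {"q1": {"S1", "S3"}, "q2": {"S6"}}

map_score = mean_average_precision(runs, qrels)
p_at_2 = precision_at_k(runs["q1"], qrels["q1"], k=2)
```

With the toy data, query q1 has relevant statutes at ranks 1 and 3, so its average precision is (1/1 + 2/3) / 2; query q2 has its single relevant statute at rank 2, giving 1/2. MAP is the mean of the two values.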

Data Availability Statement

The Central Act dataset [46] generated during this study has been deposited in the Zenodo repository: https://doi.org/10.5281/zenodo.5088102

Notes

  1. https://www.indiacode.nic.in/

  2. https://pypi.org/project/pdfminer.six/

  3. https://sites.google.com/view/fire-2019-aila/dataset-evaluation-plan?authuser=0

  4. https://github.com/mtkh0602/LegalSummarization

  5. www.liiofindia.org/in/cases/cen/INSC/

  6. https://trec.nist.gov/trec_eval/

  7. https://www.nltk.org/

  8. http://terrier.org/

References

  1. Amati G, Van Rijsbergen C J (2002) Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans Inform Syst (TOIS) 20(4):357–389

  2. Belkin N J, Kantor P, Fox E A, Shaw J A (1995) Combining the evidence of multiple query representations for information retrieval. Inform Process Manag 31(3):431–448

  3. Bhattacharya P, Ghosh K, Ghosh S, Pal A, Mehta P, Bhattacharya A, Majumder P (2019) Overview of the FIRE 2019 AILA track: artificial intelligence for legal assistance. In: FIRE (working notes). CEUR workshop proceedings, vol 2517, pp 1–12

  4. Bhattacharya P, Paul S, Ghosh K, Ghosh S, Wyner A Z (2019) Identification of rhetorical roles of sentences in Indian legal judgments. arXiv:1911.05405

  5. Bhatti U A, Huang M, Wu D, Zhang Y, Mehmood A, Han H (2019) Recommendation system using feature extraction and pattern recognition in clinical care systems. Enterp Inform Syst 13(3):329–351. https://doi.org/10.1080/17517575.2018.1557256

  6. Das A, Ganguly D, Garain U (2017) Named entity recognition with word embeddings and Wikipedia categories for a low-resource language. ACM Trans Asian Low-Resource Lang Inform Process (TALLIP) 16(3):1–19

  7. Farzindar A, Lapalme G (2004) Letsum, an automatic legal text summarizing system. In: Legal knowledge and information systems: JURIX 2004, the seventeenth annual conference, vol 120. IOS Press, pp 11–18

  8. Galgani F, Compton P, Hoffmann A (2012) Citation based summarisation of legal texts. In: PRICAI 2012: trends in artificial intelligence. Springer, Berlin, pp 40–52

  9. Géry M, Largeron C (2012) BM25t: a BM25 extension for focused information retrieval. Knowl Inform Syst 32(1):217–241

  10. Hachey B, Grover C (2006) Extractive summarisation of legal texts. Artif Intell Law 14(4):305–345

  11. Hliaoutakis A, Varelas G, Voutsakis E, Petrakis EGM, Milios E (2006) Information retrieval by semantic similarity. International Journal on Semantic Web and Information Systems (IJSWIS) 2(3):55–73

  12. Jain D, Borah M D, Biswas A (2020) Fine-tuning textrank for legal document summarization: a bayesian optimization based approach. In: Forum for information retrieval evaluation. FIRE 2020, pp 41–48

  13. Jain R, Agarwal A, Sharma Y (2020) Spectre@aila-fire2020: Supervised rhetorical role labeling for legal judgments using transformers. In: FIRE (working notes). CEUR Workshop proceedings, vol 2826, pp 66–70

  14. Kanapala A, Pal S, Pamula R (2019) Text summarization from legal documents: a survey. Artif Intell Rev 51(3):371–402

  15. Kim M-Y, Rabelo J, Goebel R (2019) Statute law information retrieval and entailment. In: Proceedings of the seventeenth international conference on artificial intelligence and law. ICAIL ’19, pp 283–289

  16. Kim W, Lee Y, Kim D, Won M, Jung H (2016) Ontology-based model of law retrieval system for r&d projects. In: Proceedings of the 18th annual international conference on electronic commerce: e-commerce in smart connected world. ICEC ’16

  17. Lefoane M, Koboyatshwene T, Rammidi G, Narasimham V L (2019) Legal statutes retrieval: a comparative approach on performance of title and statutes descriptive text. In: FIRE (working notes). CEUR Workshop Proceedings, vol 2517. CEUR-WS.org, pp 52–57

  18. Li J, Sun A, Han J, Li C (2020) A survey on deep learning for named entity recognition. IEEE Trans Knowl Data Eng 34(1):50–70

  19. Liu C-L, Chen K-C (2019) Extracting the gist of Chinese judgments of the supreme court. In: Proceedings of the seventeenth international conference on artificial intelligence and law, pp 73–82

  20. Liu S, Zhou M X, Pan S, Song Y, Qian W, Cai W, Lian X (2012) Tiara: interactive, topic-based visual text summarization and analysis. ACM Trans Intell Syst Technol (TIST) 3(2):1–28

  21. Liu Y-H, Chen Y-L, Ho W-L (2015) Predicting associated statutes for legal problems. Inform Process Manag 51(1):194–211. https://doi.org/10.1016/j.ipm.2014.07.003

  22. Lloret E, Palomar M (2012) Text summarisation in progress: a literature review. Artif Intell Rev 37(1):1–41

  23. Lovins J B (1968) Development of a stemming algorithm. Mech Transl Comput Linguistics 11(1–2):22–31

  24. Mandal A, Ghosh K, Bhattacharya A, Pal A, Ghosh S (2017) Overview of the FIRE 2017 irled track: information retrieval from legal documents. In: FIRE (working notes). CEUR Workshop Proceedings, vol 2036, pp 63–68

  25. Merchant K, Pande Y (2018) NLP based latent semantic analysis for legal text summarization. In: 2018 International conference on advances in computing, communications and informatics (ICACCI). IEEE, pp 1803–1807

  26. Moens M-F (2005) Combining structured and unstructured information in a retrieval model for accessing legislation. In: Proceedings of the 10th international conference on artificial intelligence and law. ICAIL ’05, pp 141–145

  27. More R, Patil J, Palaskar A, Pawde A (2019) Removing named entities to find precedent legal cases. In: FIRE (working notes). CEUR Workshop proceedings, vol 2517, pp 13–18

  28. Oard D W, Baron J R, Hedin B, Lewis D D, Tomlinson S (2010) Evaluation of information retrieval for e-discovery. Artif Intell Law 18(4):347–386

  29. Parikh V, Mathur V, Mehta P, Mittal N, Majumder P (2021) Lawsum: a weakly supervised approach for Indian legal document summarization. arXiv:2110.01188

  30. Polsley S, Jhunjhunwala P, Huang R (2016) CaseSummarizer: a system for automated summarization of legal texts. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: system demonstrations, pp 258–262

  31. Rabelo J, Kim M-Y, Goebel R, Yoshioka M, Kano Y, Satoh K (2019) A summary of the COLIEE 2019 competition. In: JSAI International symposium on artificial intelligence. Springer, pp 34–49

  32. Robertson S, Zaragoza H (2009) The probabilistic relevance framework: BM25 and beyond. Found Trends Inf Retr 3(4):333–389. https://doi.org/10.1561/1500000019

  33. Rogers A, Kovaleva O, Rumshisky A (2020) A primer in BERTology: what we know about how BERT works. Trans Assoc Comput Ling 8:842–866

  34. Saravanan M, Ravindran B, Raman S (2006) Improving legal document summarization using graphical models. In: Proceedings of the 2006 conference on legal knowledge and information systems: JURIX 2006: the nineteenth annual conference. IOS Press, NLD, pp 51–60

  35. Shao Y, Ye Z (2019) Thuir@aila 2019: information retrieval approaches for identifying relevant precedents and statutes. In: FIRE (working notes). CEUR Workshop Proceedings, vol 2517, pp 46–51

  36. Teufel S, Moens M (2002) Summarizing scientific articles: experiments with relevance and rhetorical status. Comput Linguist 28(4):409–445

  37. Thenmozhi D, Kannan K, Aravindan C (2017) A text similarity approach for precedence retrieval from legal documents. In: FIRE (working notes). CEUR Workshop Proceedings, vol 2036, pp 90–91

  38. Trappey C V, Trappey A JC, Liu B-H (2020) Identify trademark legal case precedents - using machine learning to enable semantic analysis of judgments. World Patent Inf 62:101980. https://doi.org/10.1016/j.wpi.2020.101980

  39. Turtle H (1995) Text retrieval in the legal world. Artif Intell Law 3(1):5–54

  40. Van Opijnen M, Santos C (2017) On the concept of relevance in legal information retrieval. Artif Intell Law 25(1):65–87

  41. Wang T, Chen P, Simovici D (2016) A new evaluation measure using compression dissimilarity on text summarization. Appl Intell 45(1):127–134

  42. Wu H C, Luk R W P, Wong K F, Kwok K L (2008) Interpreting TF-IDF term weights as making relevance decisions. ACM Trans Inform Syst (TOIS) 26(3):1–37

  43. Zhang N, Pu Y-F, Wang P (2015) An ontology-based approach for Chinese legal information retrieval. In: Proc CENet, pp 1–7

  44. Zhang W, Yoshida T, Tang X (2011) A comparative study of TF*IDF, LSI and multi-words for text classification. Expert Syst Appl 38(3):2758–2765

  45. Zhao Z, Ning H, Liu L, Huang C, Kong L, Han Y, Han Z (2019) Fire2019@aila: Legal information retrieval using improved BM25. In: FIRE (working notes). CEUR workshop proceedings, vol 2517, pp 40–45

  46. Parashar S (2021) An annotated dataset of Central Acts enacted by the Indian Parliament for legal research. Zenodo. https://doi.org/10.5281/zenodo.5088102

Author information

Corresponding author

Correspondence to Sakshi Parashar.

Ethics declarations

Conflict of Interests

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Parashar, S., Mittal, N. & Mehta, P. CASRank: A ranking algorithm for legal statute retrieval. Multimed Tools Appl 83, 5369–5386 (2024). https://doi.org/10.1007/s11042-023-15464-0

  • DOI: https://doi.org/10.1007/s11042-023-15464-0
