Ensemble methods for improving extractive summarization of legal case judgements

  • Original Research
  • Artificial Intelligence and Law

Abstract

Summarization of legal case judgement documents is a practical and challenging problem for which many different kinds of summarization algorithms have been tried. In this work, rather than developing yet another summarization algorithm, we investigate whether intelligently ensembling (combining) the outputs of multiple (base) summarization algorithms can lead to better summaries of legal case judgements than any of the base algorithms alone. Using two datasets of case judgement documents from the Indian Supreme Court, one with extractive gold standard summaries and the other with abstractive gold standard summaries, we apply various ensembling techniques to summaries generated by a wide variety of summarization algorithms. The ensembling methods applied range from simple voting-based methods to ranking-based and graph-based ensembling methods. We show that many of our ensembling methods yield summaries that are better than the summaries produced by any of the individual base algorithms, in terms of ROUGE and METEOR scores.
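
To make the idea of ensembling extractive summaries concrete, the following is a minimal Python sketch of two of the simpler families named in the abstract: voting-based and positional (Borda-style) ranking-based ensembling. It is an illustration under stated assumptions, not the authors' implementation: the function names `vote_ensemble` and `borda_ensemble`, the word-budget selection rule, the tie-breaking by document position, and the toy sentences are all hypothetical choices made for this example.

```python
from collections import Counter


def vote_ensemble(base_summaries, sentences, max_words):
    """Keep the sentences selected by the most base summarizers.

    base_summaries: one list of sentence indices per base algorithm.
    sentences: the document's sentences, in document order.
    max_words: word budget for the ensemble summary (an assumed stopping rule).
    """
    # Each base summary casts at most one vote per sentence.
    votes = Counter(idx for summary in base_summaries for idx in set(summary))
    # Most votes first; ties are broken by earlier position in the document.
    ranked = sorted(votes, key=lambda idx: (-votes[idx], idx))
    chosen, used = [], 0
    for idx in ranked:
        length = len(sentences[idx].split())
        if used + length <= max_words:
            chosen.append(idx)
            used += length
    return sorted(chosen)  # present the summary in document order


def borda_ensemble(base_rankings, k):
    """Positional voting: higher-ranked sentences earn more points.

    base_rankings: one ranked list of sentence indices per base algorithm,
    best sentence first. Returns the k highest-scoring sentence indices.
    """
    n = max(len(ranking) for ranking in base_rankings)
    scores = Counter()
    for ranking in base_rankings:
        for position, idx in enumerate(ranking):
            scores[idx] += n - position  # first place gets n points
    ranked = sorted(scores, key=lambda idx: (-scores[idx], idx))
    return sorted(ranked[:k])


# Toy document and base outputs, invented purely for illustration.
sentences = [
    "The appeal is allowed.",
    "The facts are as follows.",
    "The High Court erred in law.",
    "Costs are awarded to the appellant.",
]
base_summaries = [[0, 2], [2, 3], [0, 2, 1]]
base_rankings = [[2, 0, 1], [2, 3, 0], [0, 2, 3]]

voted = vote_ensemble(base_summaries, sentences, max_words=10)
borda = borda_ensemble(base_rankings, k=2)
print([sentences[i] for i in voted])
print([sentences[i] for i in borda])
```

In this toy case both schemes select sentences 0 and 2, because those are the sentences on which the three hypothetical base summarizers agree most strongly. The graph-based methods mentioned in the abstract are not covered by this sketch.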

Notes

  1. A few sentences present in the original legal documents have been slightly modified by the experts to improve their grammatical flow.

  2. Since there is only one supervised domain-specific algorithm, we do not separately consider domain-specific and domain-independent algorithms among supervised ones.

  3. https://en.wikipedia.org/wiki/Positional_voting.

Acknowledgements

The authors acknowledge the anonymous reviewers whose comments greatly helped to improve the paper. The research is partially supported by the TCG Centres for Research and Education in Science and Technology (CREST), India through a project titled “Smart Legal Consultant: AI-based Legal Analytics”.

Author information

Corresponding author

Correspondence to Aniket Deroy.

About this article

Cite this article

Deroy, A., Ghosh, K. & Ghosh, S. Ensemble methods for improving extractive summarization of legal case judgements. Artif Intell Law 32, 231–289 (2024). https://doi.org/10.1007/s10506-023-09349-8

  • DOI: https://doi.org/10.1007/s10506-023-09349-8
