Abstract
We present a system for retrieving the legal opinions most relevant to a given legal case or question. To this end, we evaluated several state-of-the-art neural language models. As training and testing data, we use tens of thousands of legal cases in the form of question-opinion pairs. The text data was subjected to advanced preprocessing adapted to the specifics of the legal domain. We empirically found the BERT-based HerBERT model to perform best in the considered scenario.
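The abstract only names the retrieval approach; the paper's implementation details are not reproduced here. As an illustration, the sketch below shows one plausible way to use a pretrained HerBERT encoder for opinion retrieval: embed the query and the candidate opinions, then rank opinions by cosine similarity. The allegro/herbert-base-cased checkpoint, the mean-pooling step, and the similarity measure are assumptions made for the example, not necessarily the authors' configuration.

# Minimal sketch (not the authors' code): dense retrieval of legal opinions
# with a pretrained HerBERT encoder. Assumes torch and transformers installed.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allegro/herbert-base-cased")
model = AutoModel.from_pretrained("allegro/herbert-base-cased")
model.eval()

def embed(texts):
    # Mean-pool the last hidden states over non-padding tokens,
    # then L2-normalize so dot products equal cosine similarities.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)        # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    return torch.nn.functional.normalize(pooled, dim=1)

query = "Czy najemca odpowiada za szkody wyrzadzone przez podnajemce?"
opinions = ["opinion text 1", "opinion text 2", "opinion text 3"]  # candidates

scores = embed([query]) @ embed(opinions).T             # cosine similarities
ranking = scores.squeeze(0).argsort(descending=True)
print(ranking)                                          # opinion indices, best first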
References
Aghdam, M.H.: Automatic extractive and generic document summarization based on NMF. J. Artif. Intell. Soft Comput. Res. 12(1), 37–49 (2023). https://doi.org/10.2478/jaiscr-2023-0003
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
Biagioli, C., Francesconi, E., Passerini, A., Montemagni, S., Soria, C.: Automatic semantics extraction in law documents. In: Proceedings of the 10th International Conference on Artificial Intelligence and Law, pp. 133–140 (2005)
Chen, Y., Feng, Y., Gao, D., Li, J., Xiong, D., Liu, L.: The best of both worlds: Combining recent advances in neural machine translation. arXiv preprint arXiv:1804.09847 (2018)
Dedek, M., Scherer, R.: Transformer-based original content recovery from obfuscated powershell scripts. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds.) ICONIP 2022. CCIS, vol. 1794, pp. 284–295. Springer, Singapore (2022). https://doi.org/10.1007/978-981-99-1648-1_24
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota (2019). https://doi.org/10.18653/v1/N19-1423, https://aclanthology.org/N19-1423
Feng, F., Yang, Y., Cer, D., Arivazhagan, N., Wang, W.: Language-agnostic BERT sentence embedding. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 878–891. Association for Computational Linguistics, Dublin, Ireland (2022). https://doi.org/10.18653/v1/2022.acl-long.62, https://aclanthology.org/2022.acl-long.62
Grycuk, R., Scherer, R., Marchlewska, A., Napoli, C.: Semantic hashing for fast solar magnetogram retrieval. J. Artif. Intell. Soft Comput. Res. 12(4), 299–306 (2022). https://doi.org/10.2478/jaiscr-2022-0020
Jain, D., Borah, M.D., Biswas, A.: A sentence is known by the company it keeps: improving legal document summarization using deep clustering. Artif. Intell. Law, 1–36 (2023)
Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
Kim, Y., Denton, C., Hoang, L., Rush, A.M.: Structured attention networks. In: International Conference on Learning Representations (2017)
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Ma, Y., et al.: LeCaRD: a legal case retrieval dataset for Chinese law system. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2342–2348 (2021)
Maxwell, K.T., Oberlander, J., Lavrenko, V.: Evaluation of semantic events for legal case retrieval. In: Proceedings of the WSDM 2009 Workshop on Exploiting Semantic Annotations in Information Retrieval, pp. 39–41 (2009)
Mroczkowski, R., Rybak, P., Wróblewska, A., Gawlik, I.: HerBERT: efficiently pretrained transformer-based language model for Polish. In: Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, pp. 1–10. Association for Computational Linguistics, Kyiv, Ukraine (2021). https://aclanthology.org/2021.bsnlp-1.1
Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. ACM SIGIR Forum 51(2), 202–208 (2017)
Rabelo, J., Kim, M.-Y., Goebel, R., Yoshioka, M., Kano, Y., Satoh, K.: A summary of the COLIEE 2019 competition. In: Sakamoto, M., Okazaki, N., Mineshima, K., Satoh, K. (eds.) JSAI-isAI 2019. LNCS (LNAI), vol. 12331, pp. 34–49. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58790-1_3
Radford, A., Narasimhan, K., Salimans, T., Sutskever, I., et al.: Improving language understanding by generative pre-training (2018)
Robertson, S.E., Walker, S., Jones, S., Hancock-Beaulieu, M.M., Gatford, M.: Okapi at TREC-3. In: Proceedings of the Third Text REtrieval Conference (TREC-3), NIST Special Publication 500-225, pp. 109–126 (1995)
Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24(5), 513–523 (1988)
Shao, Y., et al.: BERT-PLI: modeling paragraph-level interactions for legal case retrieval. In: IJCAI, pp. 3501–3507 (2020)
Shao, Y., Wu, Y., Liu, Y., Mao, J., Zhang, M., Ma, S.: Investigating user behavior in legal case retrieval. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 962–972 (2021)
Talmor, A., Berant, J.: MultiQA: an empirical investigation of generalization and transfer in reading comprehension. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 4911–4921 (2019)
Tran, V., Le Nguyen, M., Tojo, S., Satoh, K.: Encoded summarization: summarizing documents into continuous vector space for legal case retrieval. Artif. Intell. Law 28, 441–467 (2020)
Vaissnave, V., Deepalakshmi, P.: Modeling of automated glowworm swarm optimization based deep learning model for legal text summarization. Multimedia Tools Appl. 82, 1–20 (2022)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., Zhou, M.: MiniLM: deep self-attention distillation for task-agnostic compression of pre-trained transformers. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 5776–5788. Curran Associates, Inc. (2020). https://proceedings.neurips.cc/paper/2020/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Zhang, Y., Chen, Y., Feng, Y., Gao, D., Liu, L.: HiBERT: hierarchical attention networks for document classification. arXiv preprint arXiv:1909.09610 (2019)
Acknowledgments
The work was supported by funds from the project "A semi-autonomous system for generating legal advice and opinions based on automatic query analysis using the transformer-type deep neural network architecture with multitasking learning", POIR.01.01.01-00-1965/20.
The project was financed under the programme of the Polish Minister of Science and Higher Education named "Regional Initiative of Excellence" in the years 2019–2023, project number 020/RID/2018/19, amount of financing PLN 12,000,000.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Osowski, M. et al. (2023). Previous Opinions is All You Need—Legal Information Retrieval System. In: Nguyen, N.T., et al. Advances in Computational Collective Intelligence. ICCCI 2023. Communications in Computer and Information Science, vol 1864. Springer, Cham. https://doi.org/10.1007/978-3-031-41774-0_5
DOI: https://doi.org/10.1007/978-3-031-41774-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41773-3
Online ISBN: 978-3-031-41774-0
eBook Packages: Computer Science, Computer Science (R0)