A Legal Information Retrieval System for Statute Law

  • Conference paper
Recent Challenges in Intelligent Information and Database Systems (ACIIDS 2022)

Abstract

The statute law retrieval task requires a system to retrieve the legal articles relevant to a given legal bar exam query. Transformer-based approaches have proven more robust than traditional machine learning and information retrieval methods on legal documents. However, those approaches mainly apply domain adaptation and do not address the challenges posed by the characteristics of legal queries and legal documents. This paper identifies two such challenges and proposes methods to tackle them. The first is a mismatch in language: articles are written in abstract language, whereas queries may describe a specific scenario. A specialized model addresses this mismatch. The second is the length of articles and queries, which another specialized model handles. In our experiments, the proposed system achieved a state-of-the-art F2 score of 76.87%, an improvement of 3.85% over the previous best system. The code will be available at https://github.com/nguyenlab/statute_law_IR.
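
For readers unfamiliar with the metric, the F2 score is the F-beta measure with beta = 2, which weights recall more heavily than precision. The sketch below computes it for a single query and then averages over queries. The macro-averaging over queries, the function names, and the article identifiers are assumptions made for illustration; this is not the authors' evaluation script.

```python
from typing import Iterable, List, Tuple


def f_beta(retrieved: Iterable[str], relevant: Iterable[str], beta: float = 2.0) -> float:
    """F-beta score for one query; beta = 2 favours recall over precision."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    true_positives = len(retrieved_set & relevant_set)
    if true_positives == 0:
        # No overlap (or an empty set on either side) gives a score of zero.
        return 0.0
    precision = true_positives / len(retrieved_set)
    recall = true_positives / len(relevant_set)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def macro_f2(runs: List[Tuple[List[str], List[str]]]) -> float:
    """Average the per-query F2 over (retrieved, relevant) pairs (assumed averaging scheme)."""
    if not runs:
        return 0.0
    return sum(f_beta(ret, rel) for ret, rel in runs) / len(runs)


# Hypothetical article identifiers, for illustration only.
extra_article = f_beta(["Art. 96", "Art. 95"], ["Art. 96"])    # precision 0.5, recall 1.0
missed_article = f_beta(["Art. 96"], ["Art. 96", "Art. 95"])   # precision 1.0, recall 0.5
print(round(extra_article, 4))   # 0.8333
print(round(missed_article, 4))  # 0.5556
```

In this example, returning one extra irrelevant article costs less than missing one relevant article, which is why a recall-oriented metric suits a task where overlooking an applicable article is the more serious error.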

Notes

  1. https://huggingface.co/nguyenthanhasia/BERTLaw

Acknowledgment

This work was supported by JSPS KAKENHI Grant Numbers 20H04295, 20K20406, and 20K20625.

Author information

Corresponding author

Correspondence to Chau Nguyen.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nguyen, C., Le, NK., Nguyen, DH., Nguyen, P., Nguyen, LM. (2022). A Legal Information Retrieval System for Statute Law. In: Szczerbicki, E., Wojtkiewicz, K., Nguyen, S.V., Pietranik, M., Krótkiewicz, M. (eds) Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2022. Communications in Computer and Information Science, vol 1716. Springer, Singapore. https://doi.org/10.1007/978-981-19-8234-7_29

  • DOI: https://doi.org/10.1007/978-981-19-8234-7_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-8233-0

  • Online ISBN: 978-981-19-8234-7

  • eBook Packages: Computer Science, Computer Science (R0)
