Incorporating Structural Information into Legal Case Retrieval

Authors:

Shaoping Ma

ACM Transactions on Information Systems, Volume 42, Issue 2

Article No.: 40, Pages 1 - 28

https://doi.org/10.1145/3609796

Published: 08 November 2023

Abstract

Legal case retrieval has received increasing attention in recent years. However, compared to ad hoc retrieval tasks, legal case retrieval poses unique challenges. First, case documents are lengthy and contain complex legal structures, so it is difficult for most existing dense retrieval models to encode an entire document and capture its inherent structural information. Most existing methods simply truncate the document to meet the input length limit of pre-trained language models (PLMs), which leads to information loss. Additionally, the definition of relevance in the legal domain differs from that in the general domain, and previous semantic-based or lexical-based methods fail to provide a comprehensive understanding of the relevance of legal cases. In this article, we propose a Structured Legal case Retrieval (SLR) framework, which incorporates internal and external structural information to address these two challenges. Specifically, to avoid truncating long legal documents, the internal structural information, that is, the organization pattern of legal documents, is used to split a case document into segments. By dividing the document-level semantic matching task into segment-level subtasks, SLR can process each segment separately, using a method suited to that segment's characteristics. In this way, the key elements of a case document can be highlighted without losing other content. Second, toward a better understanding of relevance in the legal domain, we investigate the connections between criminal charges appearing in a large-scale case corpus to generate a charge-wise relation graph. The similarity between criminal charges can then be pre-computed as external structural information to enhance the recognition of relevant cases. Finally, a learning-to-rank algorithm integrates the features collected from the internal and external structures to output the final retrieval results.
Experimental results on public legal case retrieval benchmarks demonstrate the superior effectiveness of SLR over existing state-of-the-art baselines, including traditional bag-of-words and neural-based methods. Furthermore, we conduct a case study to visualize how the proposed model focuses on key elements and improves retrieval performance.
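
The three steps described in the abstract (segment-level matching, pre-computed charge similarity, and feature integration) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The section markers, the bag-of-words cosine matcher, the co-occurrence-based charge similarity, and the fixed linear weights are all assumptions chosen for illustration; in particular, the fixed weights stand in for a trained learning-to-rank model, and the abstract does not state how the charge-wise relation graph is actually built.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical section markers; real case documents use their own headings.
SECTION_MARKERS = ("FACTS:", "REASONING:", "DECISION:")


def split_segments(doc):
    """Split a case document into named segments at the hypothetical markers."""
    pattern = "(" + "|".join(re.escape(m) for m in SECTION_MARKERS) + ")"
    parts = re.split(pattern, doc)
    # With a capturing group, re.split yields [prefix, marker, text, marker, text, ...].
    return {parts[i].rstrip(":").lower(): parts[i + 1].strip()
            for i in range(1, len(parts) - 1, 2)}


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def build_charge_graph(case_charges):
    """Count charge frequencies and pairwise co-occurrences across cases."""
    freq, edges = Counter(), Counter()
    for charges in case_charges:
        unique = sorted(set(charges))
        freq.update(unique)
        edges.update(combinations(unique, 2))
    return freq, edges


def charge_similarity(a, b, freq, edges):
    """Jaccard-style score: shared cases over cases mentioning either charge."""
    if a == b:
        return 1.0
    both = edges[tuple(sorted((a, b)))]
    either = freq[a] + freq[b] - both
    return both / either if either else 0.0


def features(query, candidate, q_charges, c_charges, freq, edges):
    """One similarity per segment (internal) plus the best charge match (external)."""
    q_seg, c_seg = split_segments(query), split_segments(candidate)
    feats = [cosine(q_seg.get(name, "").split(), c_seg.get(name, "").split())
             for name in ("facts", "reasoning", "decision")]
    best = max((charge_similarity(a, b, freq, edges)
                for a in q_charges for b in c_charges), default=0.0)
    return feats + [best]


def score(feats, weights=(0.5, 0.2, 0.1, 0.2)):
    """Fixed linear weights stand in for a trained learning-to-rank model."""
    return sum(w * f for w, f in zip(weights, feats))
```

To rank candidates under these assumptions, one would call `build_charge_graph` once over the corpus, compute `features` for each query-candidate pair, and sort the candidates by `score`. Because each segment yields its own feature, no part of a long document has to be truncated to fit a single encoder input.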
