ABSTRACT
Domain-specific Information Retrieval (IR) is generally challenging because datasets and benchmarks are scarce, vocabularies are niche, and literature coverage is limited. Legal IR is no exception and presents obstacles of its own, reinforcing the need for innovation and, sometimes, paradigm shifts. Doctrine, one of the largest Legaltech companies in Europe, dedicates an entire data science team to advancing on these problems and identifying new opportunities. In this presentation, we provide some intuition regarding the specificities of legal IR (e.g., what is relevance?), and we introduce some of the solutions currently used on doctrine.fr.
In particular, we show how we use named entity recognition across the various forms of content we host, and how it enhances the search engine. With knowledge extracted from documents, we can build sufficiently large datasets to train learning-to-rank algorithms. This, combined with several domain-specific vocabulary enrichments to increase recall, dramatically improves the search experience for our users.
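To give some intuition for how vocabulary enrichment can increase recall, here is a minimal sketch of synonym-based query expansion. It is not the system used on doctrine.fr: the synonym table, the toy documents, and the function names `expand_query` and `retrieve` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical legal synonym table; the entries are illustrative only.
LEGAL_SYNONYMS: Dict[str, List[str]] = {
    "dismissal": ["termination", "redundancy"],
    "lease": ["tenancy"],
}


def expand_query(query: str, synonyms: Dict[str, List[str]]) -> Set[str]:
    """Return the lowercased query terms plus any listed domain synonyms."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for term in terms:
        expanded.update(synonyms.get(term, []))
    return expanded


def retrieve(terms: Set[str], docs: Dict[str, str]) -> List[str]:
    """Return sorted ids of documents sharing at least one token with terms."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if terms & set(text.lower().split())
    )


# Toy corpus, invented for this example.
DOCS: Dict[str, str] = {
    "d1": "unfair dismissal of an employee",
    "d2": "termination of the employment contract",
    "d3": "commercial tenancy dispute",
}

# Without expansion, only the document containing the literal term matches.
baseline = retrieve({"dismissal"}, DOCS)
# With expansion, a document using a synonym is also retrieved.
expanded = retrieve(expand_query("dismissal", LEGAL_SYNONYMS), DOCS)
```

In this toy setting, the expanded query also matches the document that says "termination" rather than "dismissal", which is the recall gain the enrichment is meant to provide. A production system would normally apply such expansions inside the search engine's analysis chain, with weighting to limit the loss of precision.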
- Find Relevant Cases in All Cases: Your Journey at Doctrine