ABSTRACT
This half-day tutorial introduces participants to the basic concepts underlying neural Cross-Language Information Retrieval (CLIR). It discusses the most common algorithmic approaches to CLIR, focusing on modern neural methods; the history of CLIR; where to find and how to use CLIR training collections, test collections, and baseline systems; how CLIR training and test collections are constructed; and open research questions in CLIR.
Index Terms
- Neural Methods for Cross-Language Information Retrieval