ABSTRACT
In this paper we present a system for cross-lingual information retrieval that can handle tens of languages and millions of documents. We demonstrate the system on a corpus of European legislation (22 languages, more than 400,000 documents per language). The system provides an interactive web interface that lets the user dynamically re-rank the retrieval results based on their mapping onto a predefined thesaurus.
Index Terms
- Cross-lingual search over 22 European languages