Abstract
We investigate a range of crosslingual web retrieval tasks using the test suite of the CLEF 2005 WebCLEF track, which features a stream of known-item topics in various languages. Our main findings are: (i) straightforward indexing and retrieval are effective for mixed monolingual web retrieval; (ii) standard machine translation methods are effective for bilingual web retrieval; but (iii) standard combination methods are ineffective for multilingual web retrieval. We analyze this failure and suggest an alternative Z-score normalization that leads to effective multilingual retrieval results.
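The abstract names Z-score normalization but does not give its formula, so the following is only a minimal sketch of the general idea, assuming the textbook definition (score minus the mean, divided by the standard deviation, computed per result list). The function names `z_normalize` and `merge_runs`, the example scores, and the choice to keep a document's best score when it appears in several lists are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def z_normalize(run):
    """Rescale one result list to (score - mean) / std.

    `run` maps document ids to raw retrieval scores. This is the textbook
    z-score, assumed here; the paper may define its variant differently.
    """
    scores = list(run.values())
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores equal: no ranking information, so map everything to 0.
        return {doc: 0.0 for doc in run}
    return {doc: (score - mu) / sigma for doc, score in run.items()}


def merge_runs(runs):
    """Merge several result lists into one ranking on normalized scores.

    Assumption: a document found in more than one list keeps its best
    normalized score.
    """
    merged = {}
    for run in runs:
        for doc, z in z_normalize(run).items():
            merged[doc] = max(z, merged.get(doc, float("-inf")))
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-language runs whose raw scores live on different scales.
english = {"en1": 12.0, "en2": 8.0, "en3": 4.0}
dutch = {"nl1": 0.9, "nl2": 0.5, "nl3": 0.1}

for doc, z in merge_runs([english, dutch]):
    print(f"{doc}\t{z:.3f}")
```

Sorting the raw scores directly would rank every English document above every Dutch one simply because the English scores are numerically larger. After normalization, the top document of each list competes on equal footing.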
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kamps, J., de Rijke, M., Sigurbjörnsson, B. (2006). Combination Methods for Crosslingual Web Retrieval. In: Peters, C., et al. Accessing Multilingual Information Repositories. CLEF 2005. Lecture Notes in Computer Science, vol 4022. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11878773_93
DOI: https://doi.org/10.1007/11878773_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45697-1
Online ISBN: 978-3-540-45700-8
eBook Packages: Computer Science, Computer Science (R0)