Abstract
The University of Maryland participated in the CLEF 2000 multilingual task, submitting three official runs that explored the impact of applying language-independent stemming techniques to dictionary-based cross-language information retrieval. The paper begins by describing a cross-language information retrieval architecture based on balanced document translation. A four-stage backoff strategy for improving the coverage of dictionary-based translation techniques is then introduced, and an implementation based on automatically trained statistical stemming is presented. Results indicate that competitive performance can be achieved using four-stage backoff translation in conjunction with freely available bilingual dictionaries, but that the usefulness of the statistical stemming algorithms that were tried varies considerably across the three languages to which they were applied.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oard, D.W., Levow, GA., Cabezas, C.I. (2001). CLEF Experiments at Maryland: Statistical Stemming and Backoff Translation. In: Peters, C. (ed.) Cross-Language Information Retrieval and Evaluation. CLEF 2000. Lecture Notes in Computer Science, vol 2069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44645-1_17
DOI: https://doi.org/10.1007/3-540-44645-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42446-8
Online ISBN: 978-3-540-44645-3
eBook Packages: Springer Book Archive