CLEF Experiments at Maryland: Statistical Stemming and Backoff Translation

  • Conference paper
Cross-Language Information Retrieval and Evaluation (CLEF 2000)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2069))

Abstract

The University of Maryland participated in the CLEF 2000 multilingual task, submitting three official runs that explored the impact of applying language-independent stemming techniques to dictionary-based cross-language information retrieval. The paper begins by describing a cross-language information retrieval architecture based on balanced document translation. A four-stage backoff strategy for improving the coverage of dictionary-based translation techniques is then introduced, and an implementation based on automatically trained statistical stemming is presented. Results indicate that competitive performance can be achieved using four-stage backoff translation in conjunction with freely available bilingual dictionaries, but that the usefulness of the statistical stemming algorithms that were tried varies considerably across the three languages to which they were applied.
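
The four-stage backoff idea can be illustrated with a minimal sketch. The abstract does not enumerate the stages, so the ordering below (surface form to surface form, stem to surface form, surface form to stem, stem to stem) is an assumption, as are the names `backoff_translate`, `toy_stem`, and `TERM_LIST`. The truncation stemmer is only a placeholder for the automatically trained statistical stemmer the paper describes, and the tiny French-English term list is invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical bilingual term list: source-language term -> target-language translations.
TERM_LIST: Dict[str, List[str]] = {
    "maison": ["house"],
    "chant": ["song"],
    "parler": ["speak"],
}


def toy_stem(word: str) -> str:
    """Placeholder stemmer (assumption): keep the first five characters."""
    return word[:5]


def backoff_translate(
    term: str,
    term_list: Dict[str, List[str]],
    stem: Callable[[str], str],
) -> Tuple[int, List[str]]:
    """Return (stage, translations); stage 0 means no stage matched."""
    # Stage 1: surface form of the term against surface forms in the term list.
    if term in term_list:
        return 1, term_list[term]

    # Stage 2: stem of the term against surface forms in the term list.
    stemmed = stem(term)
    if stemmed in term_list:
        return 2, term_list[stemmed]

    # Index the term list by the stems of its source-language entries.
    by_stem: Dict[str, List[str]] = {}
    for source, targets in term_list.items():
        by_stem.setdefault(stem(source), []).extend(targets)

    # Stage 3: surface form of the term against stems of term-list entries.
    if term in by_stem:
        return 3, by_stem[term]

    # Stage 4: stem of the term against stems of term-list entries.
    if stemmed in by_stem:
        return 4, by_stem[stemmed]

    # No match at any stage: pass the term through untranslated (assumption).
    return 0, [term]


if __name__ == "__main__":
    for word in ["maison", "chantons", "parle", "maisons", "xyz"]:
        print(word, backoff_translate(word, TERM_LIST, toy_stem))
```

Each stage loosens the match only when the stricter one fails, so an exact dictionary hit is never overridden by a stem-based guess. That ordering is a design choice of this sketch, not a detail confirmed by the abstract.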

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oard, D.W., Levow, GA., Cabezas, C.I. (2001). CLEF Experiments at Maryland: Statistical Stemming and Backoff Translation. In: Peters, C. (eds) Cross-Language Information Retrieval and Evaluation. CLEF 2000. Lecture Notes in Computer Science, vol 2069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44645-1_17

  • DOI: https://doi.org/10.1007/3-540-44645-1_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42446-8

  • Online ISBN: 978-3-540-44645-3

  • eBook Packages: Springer Book Archive
