Finding Cross-Lingual Spelling Variants

  • Conference paper
String Processing and Information Retrieval (SPIRE 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3246)

Abstract

Finding term translations as cross-lingual spelling variants on the fly is an important problem for cross-lingual information retrieval (CLIR). CLIR is typically approached by automatically translating a query into the target language. For an overview of cross-lingual information retrieval, see [1]. When the query is translated automatically, specialized terminology is often missing from the translation dictionary. The analysis of query properties in [2] has shown that proper names and technical terms are often prime keys in queries, and if they are not properly translated or transliterated, query performance may deteriorate significantly. As proper names often need no translation, a trivial solution is to include the untranslated keys as such in the target-language query. However, technical terms in European languages often have common Greek or Latin roots, which allows for a more advanced solution: using approximate string matching to find the word or words most similar to the source keys in the index of the target-language text database [3].
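To make the approximate string matching idea concrete, the following is a minimal sketch and not the method of this paper: it ranks the words of a target-language index by the Dice coefficient over adjacent character bigrams. The similarity measure, the function names, and the small Finnish word list are all assumptions chosen for illustration.

```python
def bigrams(word):
    """Return the set of adjacent character bigrams of a padded, lowercased word."""
    padded = f"_{word.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a, b):
    """Dice coefficient between the bigram sets of two words (0.0 to 1.0)."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def closest_variants(source_key, target_index, k=3):
    """Rank the words of the target index by similarity to the source key."""
    ranked = sorted(target_index, key=lambda w: dice(source_key, w), reverse=True)
    return ranked[:k]


# Hypothetical target-language index (Finnish word forms), invented for illustration.
index = ["kromatografia", "kartografia", "kromosomi", "biologia"]
best = closest_variants("chromatography", index, k=2)
print(best)  # ['kromatografia', 'kartografia']
```

The English key and its Finnish variant share the bigrams of the common root ("romatogra"), so the variant ranks first even though the spelling differs at both ends of the word.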

References

  1. Oard, D., Diekema, A.: Cross language information retrieval. Annual Review of Information Science and Technology 33, 223–256 (1998)

  2. Pirkola, A., Järvelin, K.: Employing the resolution power of search keys. Journal of the American Society of Information Science 52, 575–583 (2001)

  3. Pirkola, A., Hedlund, T., Keskustalo, H., Järvelin, K.: Dictionary-based cross-language information retrieval: Problems, methods, and research findings. Information Retrieval 4, 209–230 (2001)

  4. Keskustalo, H., Pirkola, A., Visala, K., Leppänen, E., Järvelin, K.: Non-adjacent digrams improve matching of cross-lingual spelling variants. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds.) SPIRE 2003. LNCS, vol. 2857, pp. 2003–2010. Springer, Heidelberg (2003)

  5. Pirkola, A., Toivonen, J., Keskustalo, H., Visala, K., Järvelin, K.: Fuzzy translation of cross-lingual spelling variants. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 345–352. ACM Press, New York (2003)

  6. Knight, K., Graehl, J.: Machine transliteration. Computational Linguistics 24, 599–612 (1998)

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lindén, K. (2004). Finding Cross-Lingual Spelling Variants. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_21

  • DOI: https://doi.org/10.1007/978-3-540-30213-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23210-0

  • Online ISBN: 978-3-540-30213-1

  • eBook Packages: Springer Book Archive
