
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3491)



This experiment tests a simple, scalable, and effective approach to building a domain-specific translation lexicon using distributional statistics over parallelized bilingual corpora. A bilingual lexicon is extracted from aligned Swedish-French data and used to translate CLEF topics from Swedish to French; the resulting French queries are then used to retrieve documents from the French-language CLEF collection. On the recall-oriented “precision at 1000 documents” score, 34 of 50 queries are at or above the median, and many of the errors could be handled by string matching and cognate search. We conclude that the approach presented here is a simple and efficient component in an automatic query translation system.
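As a rough illustration of the pipeline the abstract describes (extract a lexicon from aligned data, then translate query terms word by word), the sketch below scores word pairs by how often they co-occur in aligned segments. The abstract does not specify the distributional statistic used, so the Dice coefficient here is a stand-in chosen for brevity, not the authors' method. The function names `build_lexicon` and `translate_query` and the four toy Swedish-French segment pairs are hypothetical.

```python
from collections import Counter
from itertools import product


def build_lexicon(aligned_pairs):
    """Map each source word to the target word with the highest Dice score.

    aligned_pairs: iterable of (source_segment, target_segment) strings.
    Dice is an assumed stand-in for the paper's distributional statistic.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_seg, tgt_seg in aligned_pairs:
        # Count each word once per segment (segment-level co-occurrence).
        src_words = set(src_seg.lower().split())
        tgt_words = set(tgt_seg.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    best = {}
    for (s, t), joint in pair_count.items():
        score = 2 * joint / (src_count[s] + tgt_count[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}


def translate_query(query, lexicon):
    """Translate word by word; unknown words pass through unchanged."""
    return [lexicon.get(word, word) for word in query.lower().split()]


# Hypothetical toy data, not taken from the paper's corpus.
pairs = [
    ("fiske i europa", "pêche en europe"),
    ("fiske och jakt", "pêche et chasse"),
    ("jakt i skogen", "chasse dans la forêt"),
    ("europa och handel", "europe et commerce"),
]
lexicon = build_lexicon(pairs)
print(translate_query("fiske europa", lexicon))
```

Passing unknown words through unchanged is a deliberate choice in this sketch: it leaves proper names and cognates intact, which is where the string-matching and cognate search mentioned in the abstract would take over.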





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karlgren, J., Sahlgren, M., Järvinen, T., Cöster, R. (2005). Dynamic Lexica for Query Translation. In: Peters, C., Clough, P., Gonzalo, J., Jones, G.J.F., Kluck, M., Magnini, B. (eds) Multilingual Information Access for Text, Speech and Images. CLEF 2004. Lecture Notes in Computer Science, vol 3491. Springer, Berlin, Heidelberg.


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27420-9

  • Online ISBN: 978-3-540-32051-7

  • eBook Packages: Computer Science (R0)
