Automatic Dictionary Creation by Sub-symbolic Encoding of Words

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3931)

Abstract

This paper describes a technique for automatically creating dictionaries from a sub-symbolic representation of words in a cross-language context. Semantic relationships among the words of two languages are extracted from aligned bilingual text corpora by applying Latent Semantic Analysis to the matrices that represent term co-occurrences in aligned text fragments. The technique finds the “best translation” of a word according to a properly defined geometric distance in the automatically created semantic space. In experiments, the best case reached a correctness of 95%.
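The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matrix layout, the weighting, or the distance, so the sketch assumes a single term-by-fragment count matrix with the terms of both languages stacked as rows, a plain truncated SVD, and cosine similarity as the geometric measure. The toy corpus `PAIRS` and the names `build_matrix`, `term_vectors`, and `best_translation` are hypothetical.

```python
import numpy as np

# Toy aligned corpus (hypothetical): each entry is one aligned fragment pair.
PAIRS = [
    ("dog", "cane"),
    ("cat", "gatto"),
    ("dog cat", "cane gatto"),
    ("house", "casa"),
]


def build_matrix(pairs):
    """Count matrix with one row per term (source terms first) and one column per fragment."""
    src_vocab = sorted({w for s, _ in pairs for w in s.split()})
    tgt_vocab = sorted({w for _, t in pairs for w in t.split()})
    terms = [("src", w) for w in src_vocab] + [("tgt", w) for w in tgt_vocab]
    index = {term: i for i, term in enumerate(terms)}
    A = np.zeros((len(terms), len(pairs)))
    for j, (s, t) in enumerate(pairs):
        for w in s.split():
            A[index[("src", w)], j] += 1
        for w in t.split():
            A[index[("tgt", w)], j] += 1
    return A, src_vocab, tgt_vocab


def term_vectors(A, k):
    """Coordinates of every term in the k-dimensional space given by a truncated SVD."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k]


def best_translation(word, pairs, k=3):
    """Target-language term whose vector is closest (by cosine) to the source word."""
    A, src_vocab, tgt_vocab = build_matrix(pairs)
    vecs = term_vectors(A, k)
    src = vecs[src_vocab.index(word)]
    tgt = vecs[len(src_vocab):]
    # Assumption: cosine similarity stands in for the paper's geometric distance.
    norms = np.linalg.norm(tgt, axis=1) * np.linalg.norm(src) + 1e-12
    sims = (tgt @ src) / norms
    return tgt_vocab[int(np.argmax(sims))]
```

On this toy corpus `k` equals the rank of the matrix, so nothing is actually discarded; with a real corpus `k` would be chosen far smaller than the vocabulary size, and that choice would affect which translations are recovered.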

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vella, F., Pilato, G., Motisi, I., Gaglio, S. (2006). Automatic Dictionary Creation by Sub-symbolic Encoding of Words. In: Apolloni, B., Marinaro, M., Nicosia, G., Tagliaferri, R. (eds) Neural Nets. WIRN NAIS 2005 2005. Lecture Notes in Computer Science, vol 3931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731177_17

  • DOI: https://doi.org/10.1007/11731177_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33183-4

  • Online ISBN: 978-3-540-33184-1

  • eBook Packages: Computer Science, Computer Science (R0)
