Using Latent Semantics for NE Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Abstract

This paper describes an algorithm that assists in discovering Named Entity (NE) translation pairs from large corpora. The algorithm is based on Latent Semantic Analysis (LSA) and Cross-Lingual Latent Semantic Indexing (CL-LSI), and is shown to discover new translation pairs automatically within a bootstrapping framework. Experiments quantify the interaction between corpus size, features, and algorithm parameters to better understand how the proposed approach works.
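To make the CL-LSI idea concrete, the following is a minimal sketch of the generic technique, not the authors' implementation: documents from two languages are merged into one bag of tokens, a term-by-document matrix is factored with a truncated SVD, and candidate translations are ranked by cosine similarity in the reduced space. The function names, the `en:`/`zh:` token prefixes, the toy corpus, and the choice of `k` are all assumptions made for illustration; the bootstrapping loop mentioned in the abstract is not shown.

```python
import numpy as np

def cl_lsi_term_vectors(docs, k):
    """Build latent term vectors from merged bilingual documents.

    docs: list of token lists; each list holds the tokens of one
          aligned document pair, concatenated (hypothetical format).
    k:    number of latent dimensions to keep.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    # Raw term-by-document count matrix (no weighting, for simplicity).
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for tok in doc:
            A[index[tok], j] += 1.0
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # Each row is one term's coordinates in the k-dimensional latent space.
    return vocab, U[:, :k] * s[:k]

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def rank_candidates(source, candidates, vocab, T):
    """Rank candidate terms by latent-space similarity to the source term."""
    index = {tok: i for i, tok in enumerate(vocab)}
    src = T[index[source]]
    scored = [(c, cosine(src, T[index[c]])) for c in candidates]
    return sorted(scored, key=lambda pair: -pair[1])

# Toy merged corpus: invented tokens, romanized to keep the example ASCII.
docs = [
    ["en:clinton", "en:visit", "zh:kelindun", "zh:fangwen"],
    ["en:clinton", "en:speech", "zh:kelindun", "zh:yanjiang"],
    ["en:putin", "en:visit", "zh:pujing", "zh:fangwen"],
    ["en:putin", "en:speech", "zh:pujing", "zh:yanjiang"],
]
vocab, T = cl_lsi_term_vectors(docs, k=3)
ranked = rank_candidates(
    "en:clinton", ["zh:kelindun", "zh:pujing", "zh:fangwen"], vocab, T
)
for term, score in ranked:
    print(term, round(score, 3))
```

In this toy corpus the name that co-occurs with `en:clinton` in every merged document ranks first. On a realistic corpus, `k` would be far smaller than the rank of the matrix, and the counts would typically be weighted (for example with tf-idf) before the SVD.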

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lim, B.P., Sproat, R.W. (2006). Using Latent Semantics for NE Translation. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_48

  • DOI: https://doi.org/10.1007/11940098_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49667-0

  • Online ISBN: 978-3-540-49668-7

  • eBook Packages: Computer Science, Computer Science (R0)
