n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2008)

Abstract

Although it has long been assumed that Word Sense Disambiguation (WSD) can be useful for Machine Translation, only recently have efforts been made to integrate the two tasks and test this assumption, particularly for Statistical Machine Translation (SMT). While different approaches have been proposed and results have started to converge positively, it is not yet clear how these applications should be integrated so that the strengths of both can be exploited. This paper contributes to the recent investigation of the usefulness of WSD for SMT by using n-best reranking to integrate WSD with SMT efficiently. This allows the use of rich contextual WSD features, which current SMT systems do not otherwise exploit. Experiments with English-Portuguese translation, using a syntactically motivated phrase-based SMT system and both symbolic and probabilistic WSD models, showed significant improvements in BLEU scores.
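
The general idea of n-best reranking with an added WSD feature can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features or weights used, so the `Hypothesis` fields, the `wsd_feature` definition (a sum of log-probabilities), the `wsd_weight` parameter, and the toy English-Portuguese data are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """One entry of an n-best list (hypothetical structure)."""
    text: str                # candidate translation
    base_score: float        # decoder model score, log domain
    # ambiguous source word -> translation this candidate used for it
    choices: dict = field(default_factory=dict)


def wsd_feature(hyp, wsd_probs, floor=1e-6):
    """Sum of log WSD probabilities for the translations chosen in `hyp`.

    `wsd_probs` maps a source word to a distribution over its candidate
    translations, as a context-aware WSD model might produce (assumed).
    Unseen translations fall back to a small floor probability.
    """
    total = 0.0
    for word, translation in hyp.choices.items():
        p = wsd_probs.get(word, {}).get(translation, floor)
        total += math.log(max(p, floor))
    return total


def rerank(nbest, wsd_probs, wsd_weight=1.0):
    """Sort candidates by decoder score plus the weighted WSD feature."""
    def combined(hyp):
        return hyp.base_score + wsd_weight * wsd_feature(hyp, wsd_probs)
    return sorted(nbest, key=combined, reverse=True)


# Toy data for a source phrase containing the ambiguous verb "take".
nbest = [
    Hypothesis("levar o trem", -1.0, {"take": "levar"}),
    Hypothesis("pegar o trem", -1.2, {"take": "pegar"}),
]
wsd_probs = {"take": {"pegar": 0.8, "levar": 0.1}}

best = rerank(nbest, wsd_probs)[0]
print(best.text)
```

With `wsd_weight=0.0` the decoder's original ranking is preserved, so the weight controls how strongly the WSD evidence can override the baseline system. In practice such weights would be tuned on held-out data rather than fixed by hand.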

Editor information

Alexander Gelbukh

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Specia, L., Sankaran, B., das Graças Volpe Nunes, M. (2008). n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_34

  • DOI: https://doi.org/10.1007/978-3-540-78135-6_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78134-9

  • Online ISBN: 978-3-540-78135-6

  • eBook Packages: Computer Science, Computer Science (R0)
