Disambiguating Word Translations with Target Language Models

Lynum, André; Marsi, Erwin; Bungum, Lars; Gambäck, Björn

doi:10.1007/978-3-642-32790-2_46


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7499)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

Word Translation Disambiguation is the task of selecting the best translation(s) for a source word in a certain context, given a set of translation candidates. Most approaches to this problem rely on large word-aligned parallel corpora, resources that are scarce and expensive to build. In contrast, the method presented in this paper requires only large monolingual corpora to build vector space models encoding sentence-level contexts of translation candidates as feature vectors in high-dimensional word space. Experimental evaluation shows positive contributions of the models to overall quality in German-English translation.
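
The general idea of ranking translation candidates by their monolingual contexts can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes raw bag-of-words counts, cosine similarity, and a toy English corpus, and the names `CORPUS`, `context_vector`, `cosine` and `rank_candidates` are hypothetical. The query is assumed to be a bag of target-language words obtained by looking up the source context in a dictionary.

```python
import math
from collections import Counter

# Toy target-language (English) corpus; a real model would use a large monolingual corpus.
CORPUS = [
    "she sat on the bank of the river",
    "the bank approved the loan",
    "he rested on the bench in the park",
    "the shore of the river was muddy",
]


def context_vector(sentences, candidate):
    """Sum bag-of-words counts over every sentence that contains the candidate."""
    vec = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if candidate in tokens:
            vec.update(t for t in tokens if t != candidate)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(candidates, sentences, context_words):
    """Order translation candidates by similarity to the translated context."""
    query = Counter(w.lower() for w in context_words)
    scored = [(cosine(context_vector(sentences, c), query), c) for c in candidates]
    return [c for _, c in sorted(scored, key=lambda pair: -pair[0])]


# German "Bank" can translate to "bank" or "bench"; the context decides.
financial = rank_candidates(["bank", "bench"], CORPUS, ["loan", "approved"])
outdoor = rank_candidates(["bank", "bench"], CORPUS, ["park", "rested"])
```

In this toy setting `financial` ranks "bank" first and `outdoor` ranks "bench" first, because only the matching candidate shares any context words with the query. A realistic model would need feature weighting and far more data than this sketch uses.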

Author information

Authors and Affiliations

Department of Computer and Information Science, Norwegian University of Science and Technology (NTNU), Sem Sælands vei 7–9, NO–7491, Trondheim, Norway
André Lynum, Erwin Marsi, Lars Bungum & Björn Gambäck

Editor information

Editors and Affiliations

Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Department of Information Technologies, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Aleš Horák, Ivan Kopeček & Karel Pala

About this paper

Cite this paper

Lynum, A., Marsi, E., Bungum, L., Gambäck, B. (2012). Disambiguating Word Translations with Target Language Models. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_46

DOI: https://doi.org/10.1007/978-3-642-32790-2_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science, Computer Science (R0)
