Semantic graph for word disambiguation in machine translation

Multimedia Tools and Applications

Abstract

This work incorporates semantic context into machine translation by combining parallel corpora with feedback from human translators. Both sources help identify the keywords and key phrases that disambiguate words or phrases with multiple possible meanings. The disambiguation process uses a probabilistic language model that captures the dependence of ambiguous words and phrases on those keywords and key phrases through parametric conditional probabilities, whose parameters are estimated from parallel corpus data. The estimates are then augmented with human translator feedback through an interface that maps the translator's degree of confidence in a disambiguation (a value between 0 and 1, where 1 denotes complete certainty) into updated language model parameters. Each ambiguous word or phrase is assigned its most probable meaning given the keywords and key phrases. This work also presents an iterative relaxation algorithm that disambiguates multiple words in one sentence by finding the translation with the highest joint probability. Experimental results are reported on testbeds in the medical and literary fiction domains, where the method compares favorably with a state-of-the-art neural network (NN) based word sense disambiguation approach. The method goes beyond NN learning by extracting and modeling the essential semantic elements of the source language to faithfully capture the meaning of the source text.
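As a rough illustration of the idea in the abstract, the sketch below estimates keyword-conditioned sense probabilities from a toy corpus and picks the most probable sense. It is not the authors' model: the naive Bayes factorization, the add-alpha smoothing, the treatment of translator confidence as a fractional example weight, and all names and data (`estimate`, `posterior`, `river_bank`, and so on) are assumptions made for this example only. The iterative relaxation step for multiple ambiguous words is not covered.

```python
import math
from collections import defaultdict


def estimate(examples, alpha=1.0):
    """Estimate P(sense) and P(keyword | sense) from weighted examples.

    examples: iterable of (sense, keywords, weight). Corpus examples use
    weight 1.0; translator feedback uses its confidence in [0, 1] as the
    weight (an assumption of this sketch, not the paper's mapping).
    """
    sense_weight = defaultdict(float)
    keyword_weight = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for sense, keywords, weight in examples:
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        sense_weight[sense] += weight
        for k in keywords:
            keyword_weight[sense][k] += weight
            vocab.add(k)
    total = sum(sense_weight.values())
    prior = {s: w / total for s, w in sense_weight.items()}
    likelihood = {}
    for s in sense_weight:
        # Add-alpha smoothing keeps unseen keyword/sense pairs nonzero.
        denom = sum(keyword_weight[s].values()) + alpha * len(vocab)
        likelihood[s] = {k: (keyword_weight[s][k] + alpha) / denom for k in vocab}
    return prior, likelihood


def posterior(prior, likelihood, keywords):
    """Posterior over senses given keywords, assuming keywords are
    conditionally independent given the sense (naive Bayes)."""
    log_scores = {}
    for s in prior:
        score = math.log(prior[s])
        for k in keywords:
            if k in likelihood[s]:  # keywords never seen in training are skipped
                score += math.log(likelihood[s][k])
        log_scores[s] = score
    top = max(log_scores.values())
    norm = sum(math.exp(v - top) for v in log_scores.values())
    return {s: math.exp(v - top) / norm for s, v in log_scores.items()}


def disambiguate(post):
    """Return the most probable sense."""
    return max(post, key=post.get)


# Hypothetical toy data for the ambiguous word "bank".
corpus = [
    ("river_bank", ["water", "fishing"], 1.0),
    ("river_bank", ["water", "boat"], 1.0),
    ("money_bank", ["loan", "account"], 1.0),
    ("money_bank", ["account", "deposit"], 1.0),
]
before = posterior(*estimate(corpus), ["water"])

# A translator reports, with confidence 0.9, that "bank" near "water"
# meant the financial sense in one sentence.
feedback = [("money_bank", ["water"], 0.9)]
after = posterior(*estimate(corpus + feedback), ["water"])

print(disambiguate(before), before)
print(disambiguate(after), after)
```

Treating feedback as a fractional count means a single confident correction shifts the estimates without overriding the corpus evidence; other mappings from confidence to parameters are equally plausible.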

Data Availability

Not Applicable.

Code availability

Not Applicable.

Author information

Corresponding author

Correspondence to Fernand S. Cohen.

Ethics declarations

Conflicts of interest/competing interests

Not Applicable.

About this article

Cite this article

Cohen, F.S., Zhong, Z. & Li, C. Semantic graph for word disambiguation in machine translation. Multimed Tools Appl 81, 43485–43502 (2022). https://doi.org/10.1007/s11042-022-13242-y

  • DOI: https://doi.org/10.1007/s11042-022-13242-y
