Investigating the roles of sentiment in machine translation

Machine Translation

Abstract

Parallel corpora are central to translation studies and contrastive linguistics. However, training machine translation (MT) systems using only the semantic aspects of a parallel corpus leads to unsatisfactory results, because the trained systems are likely to generate target sentences that differ semantically and pragmatically from the source sentence. In the present work, we explore how the performance of an MT system improves when pragmatic features such as sentiment are introduced during its development. The language pair used for the experiments is English (source language) and Bengali (target language). The improvement in the MT output, before and after the introduction of sentiment features, is quantified by comparing various translation models (statistical MT (SMT), neural MT (NMT) and a newly developed translation model, SeNA) using automated metrics (BLEU and TER) and manual evaluation. In addition, the propagation of sentiment during the translation process is studied extensively. We observe that introducing sentiment features during system development improves translation quality.
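
As a rough illustration of the automated evaluation mentioned in the abstract, the following is a minimal sketch of corpus-level BLEU (clipped n-gram precision up to 4-grams with a brevity penalty, one reference per hypothesis, no smoothing). It is not the authors' evaluation script: the function names `ngrams` and `corpus_bleu`, the whitespace tokenization, and the English toy sentences are assumptions made only for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis (no smoothing).

    Assumption: sentences are plain strings tokenized on whitespace.
    """
    matches = [0] * max_n   # clipped n-gram matches, per n-gram order
    totals = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp_tok, n), ngrams(ref_tok, n)
            # Counter intersection clips each hypothesis n-gram count
            # by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives BLEU = 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Hypothetical toy data: one reference and two candidate system outputs.
    reference = ["the cat sat on the mat"]
    system_a = ["the cat sat on the mat today"]
    system_b = ["the cat sat on the mat"]
    print("system A BLEU:", corpus_bleu(system_a, reference))
    print("system B BLEU:", corpus_bleu(system_b, reference))
```

Comparing two systems, for example before and after adding sentiment features, would then amount to calling `corpus_bleu` on each system's output against the same references. In practice an established toolkit would normally be preferred over a hand-written metric.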

Notes

  1. http://tdil.meity.gov.in/.

  2. http://nlp.amrita.edu/mtil_cen/.

  3. https://pypi.org/project/googletrans/.

  4. http://www.nltk.org/.

  5. https://nlp.stanford.edu/software/lex-parser.shtml.

  6. https://scikit-learn.org/stable/modules/generated/sklearn.utils.class_weight.compute_class_weight.html.

  7. https://ai.stanford.edu/~amaas/data/sentiment/.

  8. https://www.kaggle.com/tazimhoque/bengali-sentiment-text.

  9. https://github.com/socianltd/socian-bangla-sentiment-dataset-labeled.

  10. https://data.mendeley.com/datasets/n53xt69gnf/3.

Acknowledgements

This work is supported by Media Lab Asia, MeitY, Government of India, under the Visvesvaraya Ph.D. Scheme for Electronics & IT.

Author information

Corresponding author

Correspondence to Sainik Kumar Mahata.

About this article

Cite this article

Mahata, S.K., Das, D. & Bandyopadhyay, S. Investigating the roles of sentiment in machine translation. Machine Translation 35, 687–709 (2021). https://doi.org/10.1007/s10590-021-09291-z

  • DOI: https://doi.org/10.1007/s10590-021-09291-z
