Abstract
Parallel corpora are central to translation studies and contrastive linguistics. However, training machine translation (MT) systems while making little use of the semantic aspects of a parallel corpus leads to unsatisfactory results, because the trained systems are likely to generate target sentences that differ semantically and pragmatically from the source sentence. In the present work, we explore how the performance of an MT system improves when pragmatic features such as sentiment are introduced during its development. The language pair used for the experiments is English (source language) and Bengali (target language). The improvement in MT output before and after the introduction of sentiment features is quantified by comparing several translation models (statistical MT (SMT), neural MT (NMT), and a newly developed translation model, SeNA) using automated metrics (BLEU and TER) and manual evaluation. In addition, the propagation of sentiment during the translation process is studied extensively. We observe that introducing sentiment features during system development improves translation quality.
Acknowledgements
This work is supported by Media Lab Asia, MeitY, Government of India, under the Visvesvaraya Ph.D. Scheme for Electronics & IT.
About this article
Cite this article
Mahata, S.K., Das, D. & Bandyopadhyay, S. Investigating the roles of sentiment in machine translation. Machine Translation 35, 687–709 (2021). https://doi.org/10.1007/s10590-021-09291-z
DOI: https://doi.org/10.1007/s10590-021-09291-z