Abstract
Context-aware machine translation improves translation quality by taking the surrounding phrases into account when translating a phrase. For the low-resource language pair English–Amharic, context-aware approaches have not yet been investigated in depth. Moreover, current machine translation approaches for this pair usually require a large parallel corpus to achieve fluency. This research investigates a new approach that translates English text into Amharic text by combining context-based machine translation (CBMT) with recurrent neural network machine translation (RNNMT). We built a bilingual dictionary for the CBMT to use along with a target corpus. The RNNMT model is then given the output of the CBMT and a parallel corpus for training. The approach is evaluated using the New Testament of the Bible as a corpus. On the English–Amharic language pair, the combined approach improves on simple neural machine translation (NMT), while for a small dataset it shows no improvement over CBMT. We also assessed the impact of the dictionary used by CBMT on the overall performance of the approach. The results show that the accuracy of the dictionary, and hence the quality of the CBMT output, affects the performance of the combined approach.
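The data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the word-by-word lookup in `cbmt_draft` is only a toy stand-in for CBMT (which also relies on a target corpus), the names `TrainingExample`, `cbmt_draft` and `build_examples` are hypothetical, and the tokens `AM_GOD`, `AM_LOVE` and `AM_IS` are placeholders rather than real Amharic. How the CBMT output and the source sentence are actually fed to the RNNMT model is not stated in the abstract, so the pairing shown here is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TrainingExample:
    source: str      # English sentence from the parallel corpus
    cbmt_draft: str  # draft translation produced by the dictionary-based stage
    target: str      # reference translation from the parallel corpus


def cbmt_draft(sentence: str, dictionary: Dict[str, str]) -> str:
    """Toy stand-in for CBMT: translate word by word with a bilingual dictionary.

    Words missing from the dictionary are copied through unchanged, which is
    one simple way a weak dictionary degrades the draft.
    """
    return " ".join(dictionary.get(word.lower(), word) for word in sentence.split())


def build_examples(
    parallel: List[Tuple[str, str]], dictionary: Dict[str, str]
) -> List[TrainingExample]:
    """Attach a CBMT-style draft to every (source, target) pair (assumed pairing)."""
    return [
        TrainingExample(src, cbmt_draft(src, dictionary), tgt)
        for src, tgt in parallel
    ]


# Placeholder data: AM_* tokens stand in for Amharic words.
dictionary = {"god": "AM_GOD", "love": "AM_LOVE"}
parallel = [("God is love", "AM_GOD AM_LOVE AM_IS")]
examples = build_examples(parallel, dictionary)
print(examples[0].cbmt_draft)
```

In this sketch the word "is" has no dictionary entry and passes through untranslated, which hints at why dictionary accuracy can propagate into the combined system's output.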








Cite this article
Ashengo, Y.A., Aga, R.T. & Abebe, S.L. Context based machine translation with recurrent neural network for English–Amharic translation. Machine Translation 35, 19–36 (2021). https://doi.org/10.1007/s10590-021-09262-4
DOI: https://doi.org/10.1007/s10590-021-09262-4