Abstract
In this paper we explore neural machine translation (NMT) for Indian languages. Reported work on statistical machine translation (SMT) for Indian languages demonstrated good performance within the Indo-Aryan family, but relatively poor performance within the Dravidian family as well as between the two families. By common observation, NMT generates more fluent output than SMT, which led us to investigate NMT’s potential for translation involving Indian languages. The current practice in NMT is to train models on subword units, and among subword segmentation methods byte pair encoding (BPE) is a popular choice. We conduct extensive experiments with BPE-based NMT models for Indian languages. An interesting outcome of our study is that the optimal number of BPE merge operations for Indian language pairs appears to fall in the range of 0–5000, which is fairly low compared to the values observed for European languages. Additionally, we apply other techniques, such as phrase table injection, linguistic-feature-based enhancements of the corpora, and BERT-augmented NMT, to boost performance. To the best of our knowledge, this is the first comprehensive study on Indian language NMT (ILNMT) covering the major languages of India. As an empirical paper, we expect this work to serve as a benchmark for ILNMT research.
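To make the "number of BPE merge operations" concrete, the following is a minimal Python sketch of how BPE merges are learned and applied. It is an illustration only and not the implementation used in this study: the function names (`learn_bpe`, `merge_pair`, `segment`), the `</w>` end-of-word marker, and the toy word list are all assumptions introduced here.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge operations from a list of words (toy sketch)."""
    # Start from characters, with a hypothetical end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken deterministically.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Segment a word by applying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    toy_corpus = ["low", "low", "lower", "newest", "newest", "widest"]
    for n in (0, 3, 10):
        merges = learn_bpe(toy_corpus, n)
        print(n, segment("newest", merges))
```

With zero merges the segmentation is purely character-level, and each additional merge yields longer subword units, so a low merge count in this sketch corresponds to a vocabulary of short, character-like units.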
Notes
Figure 1 has been constructed by drawing inspiration from the following images: https://images.app.goo.gl/CYukRDcQTsytwpQ67, https://qphs.fs.quoracdn.net/main-qimg-f6e580591e48cc0829fdffcc8d4f1ae3, https://en.wikipedia.org/wiki/File:AustroAsiatic_tree_Peiros2004.png.
Acknowledgements
We would like to thank the Technology Development for Indian Languages (TDIL) programme and the Department of Electronics and Information Technology, Govt. of India, for providing the ILCI corpus. We would also like to thank the research scholars Rudra Murthy, Tamali Banerjee, Jyotsana Khatri, Kevin Patel, and Diptesh Kanojia, as well as the members of CFILT, for their valuable guidance and support.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Dewangan, S., Alva, S., Joshi, N. et al. Experience of neural machine translation between Indian languages. Machine Translation 35, 71–99 (2021). https://doi.org/10.1007/s10590-021-09263-3
DOI: https://doi.org/10.1007/s10590-021-09263-3