An empirical study of low-resource neural machine translation of Manipuri in multilingual settings

  • Original Article
  • Neural Computing and Applications

Abstract

Machine translation requires a large amount of parallel data to reach production-level translation quality. This requirement is one of the main reasons why machine translation systems are lacking for most spoken and written languages. Manipuri is one such low-resource Indian language, with very little digital text data available. In this work, we address low-resource neural machine translation between Manipuri and English by using other Indian languages in a multilingual setup. We train an LSTM-based many-to-many multilingual neural machine translation system that is infused with cross-lingual features. Experimental results show that our method improves over the vanilla many-to-many multilingual and bilingual baselines for translation both from Manipuri to English and from English to Manipuri. Furthermore, our method also improves over the vanilla many-to-many multilingual system for translation between English and all the other Indian languages, in both directions. We also examine the generalizability of our multilingual model by evaluating zero-shot translation between language pairs that have no direct link in training, and compare it against pivot-based translation.
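
To make the many-to-many setup and the zero-shot versus pivot comparison more concrete, here is a minimal Python sketch. It assumes the common convention of prepending a target-language token to each source sentence; the abstract does not state that the authors use this scheme. The function names, the `<2xx>` token format, and the language codes (`mni`, `hi`, `en`) are hypothetical, and `translate` stands in for any trained multilingual model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: (tagged-or-raw sentence, target language) -> translation.
Translator = Callable[[str, str], str]


def tag_source(sentence: str, tgt_lang: str) -> str:
    """Prepend a target-language token (assumed format: <2xx>) to the source."""
    return f"<2{tgt_lang}> {sentence}"


def build_many_to_many(
    corpora: Dict[Tuple[str, str], List[Tuple[str, str]]]
) -> List[Tuple[str, str]]:
    """Merge several bilingual corpora into one tagged training set.

    `corpora` maps (src_lang, tgt_lang) to a list of (source, target) pairs.
    """
    examples = []
    for (_src_lang, tgt_lang), pairs in corpora.items():
        for src, tgt in pairs:
            examples.append((tag_source(src, tgt_lang), tgt))
    return examples


def zero_shot(translate: Translator, sentence: str, tgt_lang: str) -> str:
    """Translate directly, even if this language pair was never seen in training."""
    return translate(sentence, tgt_lang)


def pivot(
    translate: Translator, sentence: str, tgt_lang: str, pivot_lang: str = "en"
) -> str:
    """Translate in two steps through a pivot language (English assumed here)."""
    intermediate = translate(sentence, pivot_lang)
    return translate(intermediate, tgt_lang)
```

In this sketch, zero-shot translation costs a single decoding pass but relies on the model generalizing to an unseen pair, while pivoting uses two passes over pairs that were seen in training and can compound errors from the first pass.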

Author information

Corresponding author

Correspondence to Salam Michael Singh.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Singh, S.M., Singh, T.D. An empirical study of low-resource neural machine translation of Manipuri in multilingual settings. Neural Comput & Applic 34, 14823–14844 (2022). https://doi.org/10.1007/s00521-022-07337-8

  • DOI: https://doi.org/10.1007/s00521-022-07337-8
