research-article

Low Resource Neural Machine Translation: Assamese to/from Other Indo-Aryan (Indic) Languages

Published: 16 November 2021

Abstract

Machine translation (MT) systems have been built with many different techniques to bridge language barriers. These techniques are broadly categorized into approaches such as Statistical Machine Translation (SMT) and Neural Machine Translation (NMT). End-to-end NMT systems significantly outperform SMT in translation quality on many language pairs, especially those with adequate parallel corpora. We report comparative experiments on baseline MT systems for translation between Assamese and other Indo-Aryan languages, in both directions, using traditional Phrase-Based SMT as well as more successful NMT architectures, namely a basic sequence-to-sequence (seq2seq) model with attention, the Transformer, and a finetuned Transformer. Because this is the first such work involving Assamese, the results are evaluated using the most prominent standard automatic metric, BLEU (BiLingual Evaluation Understudy), as well as other well-known metrics, to explore the performance of the different baseline MT systems. The evaluation scores of the SMT and NMT models are compared across bi-directional language pairs involving Assamese and other Indo-Aryan languages (Bangla, Gujarati, Hindi, Marathi, Odia, Sinhalese, and Urdu). The highest BLEU scores obtained are for Assamese to Sinhalese with SMT (35.63) and for Assamese to Bangla with the NMT systems (seq2seq: 50.92, Transformer: 50.01, finetuned Transformer: 50.19). We also try to relate the results to language characteristics, distances, family trees, domains, data sizes, and sentence lengths. We find that domain is the most important factor affecting the results for the given data domains and sizes. We compare our results with the only existing MT system for Assamese (Bing Translator) and also with pairs involving Hindi.
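To make the reported BLEU figures easier to interpret, the following is a minimal sketch of corpus-level BLEU on a 0 to 100 scale. It is not the authors' evaluation script: the function names (`ngram_counts`, `corpus_bleu`), the whitespace tokenization, the single reference per sentence, and the absence of smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU in [0, 100].

    Assumptions (illustrative only): one reference per hypothesis,
    pre-tokenized input, uniform n-gram weights, no smoothing.
    """
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp, n)
            ref_ngrams = ngram_counts(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
            totals[n - 1] += sum(hyp_ngrams.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref = "the cat sat on a mat".split()
    print(round(corpus_bleu([hyp], [ref]), 2))
```

Scores from a sketch like this are only comparable when tokenization and reference sets match, so published numbers are normally computed with an established toolkit rather than ad hoc code.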



      • Published in

        ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 1
        January 2022
        442 pages
        ISSN:2375-4699
        EISSN:2375-4702
        DOI:10.1145/3494068

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 16 November 2021
        • Accepted: 1 June 2021
        • Revised: 1 May 2021
        • Received: 1 July 2020


        Qualifiers

        • research-article
        • Refereed
