Abstract
This paper shows how a phrase-based statistical machine translation (PBSMT) system for the English-to-Manipuri language pair can be improved by incorporating a monolingual corpus on the target (Manipuri) side. No prior work has focused on translation for Manipuri, a Tibeto-Burman minority language of India. The system was developed with the Moses open-source toolkit and evaluated using several automatic and human evaluation techniques. With the monolingual corpus incorporated, the PBSMT system achieves a BLEU score of 10.15, compared with 9.89 for the baseline PBSMT system, using the same training, tuning, and testing datasets. The work thereby addresses the limited availability of parallel text corpora for the English-Manipuri pair.
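For readers unfamiliar with the metric behind the two scores, the following is a minimal sketch of corpus-level BLEU: clipped n-gram precisions up to order 4, combined by a geometric mean and multiplied by a brevity penalty. It assumes one reference per hypothesis and pre-tokenised input, and the function names `ngrams` and `corpus_bleu` are illustrative. It is not the scoring script used in the paper, and its numbers will not necessarily match those of any particular toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (assumes one reference per hypothesis)."""
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())

    # Any order with zero matches makes the geometric mean zero.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalise output that is shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    # Toy example with made-up tokens, not data from the paper.
    hyp = ["a", "b", "c", "d"]
    ref = ["a", "b", "c", "d", "e"]
    print(corpus_bleu([hyp], [ref]))
```

Because the n-gram counts are pooled over the whole test set before the precisions are computed, the score is a property of the corpus rather than an average of sentence scores. That is why both systems must be scored on the same test set for the comparison to be meaningful.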
Acknowledgments
I would like to express my deepest appreciation to the Technology Development for Indian Languages (TDIL) Programme, initiated by the Ministry of Electronics and Information Technology, Govt. of India, for sharing the English-Manipuri parallel corpus and the Manipuri monolingual corpus used in this research. I would also like to extend my heartfelt gratitude to the Department of Computer Science and Engineering, National Institute of Technology, Mizoram, for providing the financial assistance and laboratory facilities needed to carry out the experiments reported in this paper.
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Achom, A., Pakray, P., Gelbukh, A. (2023). Addressing the Issue of Unavailability of Parallel Corpus Incorporating Monolingual Corpus on PBSMT System for English-Manipuri Translation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13396. Springer, Cham. https://doi.org/10.1007/978-3-031-23793-5_25
DOI: https://doi.org/10.1007/978-3-031-23793-5_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23792-8
Online ISBN: 978-3-031-23793-5
eBook Packages: Computer Science, Computer Science (R0)