Abstract
This paper presents a hybrid machine translation (HMT) system that uses syntactic analysis to extract phrases from source sentences, translates the phrases using multiple online machine translation (MT) system application programming interfaces (APIs), and generates output by combining the translated chunks to obtain the best possible translation. The aim of this study is to improve the translation quality of English-Latvian texts over each of the individual MT APIs. The best translation hypothesis is selected by calculating the perplexity of each hypothesis using an n-gram language model. The result is a phrase-based multi-system machine translation system that improves MT output compared to individual online MT systems. The proposed approach shows an improvement of up to +1.48 points in BLEU and −0.015 in TER compared to the baselines and related research.
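The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy add-one smoothed bigram model in place of a production n-gram toolkit, and it assumes the chunk hypotheses have already been returned by the MT APIs. The function names (`train_bigram_lm`, `perplexity`, `combine_chunks`) and the English toy data are hypothetical and chosen only for illustration; a real system would score Latvian hypotheses against a Latvian language model.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(text, unigrams, bigrams):
    """Perplexity of `text` under an add-one smoothed bigram model."""
    tokens = ["<s>"] + text.split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(prob)
    # Normalise by the number of predicted tokens (words plus </s>).
    return math.exp(-log_prob / (len(tokens) - 1))


def combine_chunks(hypotheses_per_chunk, unigrams, bigrams):
    """Keep the lowest-perplexity hypothesis for each chunk, then join them."""
    best = [
        min(hyps, key=lambda h: perplexity(h, unigrams, bigrams))
        for hyps in hypotheses_per_chunk
    ]
    return " ".join(best)


# Toy target-side corpus and toy hypotheses (assumed data, for illustration).
unigrams, bigrams = train_bigram_lm(
    ["the cat sat on the mat", "the dog sat on the rug"]
)
chunks = [
    ["cat the", "the cat"],                # hypotheses for chunk 1
    ["on sat mat the", "sat on the mat"],  # hypotheses for chunk 2
]
print(combine_chunks(chunks, unigrams, bigrams))  # prints: the cat sat on the mat
```

Scoring each chunk independently keeps the sketch short; it ignores fluency across chunk boundaries, which a fuller system might also take into account.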
Notes
1. Latvian public administration machine translation service - http://hugo.lv.
2. Google Translate API - https://cloud.google.com/translate/.
3. Bing Translator Control - http://www.bing.com/dev/en-us/translator.
4. Yandex Translate API - https://tech.yandex.com/translate/.
5. Latvian public administration machine translation service API - http://hugo.lv/TranslationAPI.
6. ChunkMT - https://github.com/M4t1ss/ChunkMT.
Acknowledgements
The research was supported by Grant 271/2012 from the Latvian Council of Science.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Rikters, M., Skadiņa, I. (2018). Combining Machine Translated Sentence Chunks from Multiple MT Systems. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_3
DOI: https://doi.org/10.1007/978-3-319-75487-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1
eBook Packages: Computer Science, Computer Science (R0)