Combining Machine Translated Sentence Chunks from Multiple MT Systems

Rikters, Matīss; Skadiņa, Inguna

doi:10.1007/978-3-319-75487-1_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9624)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper presents a hybrid machine translation (HMT) system that uses syntactic analysis to extract phrases from source sentences, translates the phrases using the application programming interfaces (APIs) of multiple online machine translation (MT) systems, and generates output by combining the translated chunks to obtain the best possible translation. The aim of this study is to improve the translation quality of English–Latvian texts over that of each individual MT API. The best translation hypothesis is selected by calculating the perplexity of each hypothesis with an n-gram language model. The result is a phrase-based multi-system machine translation system that improves MT output compared to individual online MT systems. The proposed approach shows improvements of up to +1.48 points in BLEU and −0.015 in TER scores compared to the baselines and related research.
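
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy bigram model with add-one smoothing stands in for the n-gram language model, tokens are split on whitespace, and the function names (`train_bigram_lm`, `perplexity`, `combine_chunks`) are hypothetical. For each source chunk, the candidate translation with the lowest perplexity is kept, and the selected chunks are joined in order.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def perplexity(text, unigrams, bigrams):
    """Perplexity of `text` under an add-one smoothed bigram model (toy assumption)."""
    tokens = ["<s>"] + text.split() + ["</s>"]
    vocab_size = len(unigrams) + 1  # one extra slot for unseen words
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    # Normalize by the number of predicted tokens (every token after <s>).
    return math.exp(-log_prob / (len(tokens) - 1))


def combine_chunks(chunk_hypotheses, unigrams, bigrams):
    """Pick the lowest-perplexity hypothesis per chunk and join them in order.

    `chunk_hypotheses` is a list with one entry per source chunk; each entry
    is the list of candidate translations returned by the different MT systems.
    """
    best = [
        min(candidates, key=lambda t: perplexity(t, unigrams, bigrams))
        for candidates in chunk_hypotheses
    ]
    return " ".join(best)
```

Calling `combine_chunks` with one list of candidate translations per chunk returns a single combined sentence. In a realistic setting the toy model would be replaced by a language model trained on a large target-language corpus, since perplexity estimates from a few sentences are not reliable.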

Notes

1. Latvian public administration machine translation service - http://hugo.lv
2. Google Translate API - https://cloud.google.com/translate/
3. Bing Translator Control - http://www.bing.com/dev/en-us/translator
4. Yandex Translate API - https://tech.yandex.com/translate/
5. Latvian public administration machine translation service API - http://hugo.lv/TranslationAPI
6. ChunkMT - https://github.com/M4t1ss/ChunkMT

Acknowledgements

The research was supported by Grant 271/2012 from the Latvian Council of Science.

Author information

Authors and Affiliations

University of Latvia, 19 Raina Blvd., Riga, Latvia
Matīss Rikters
Institute of Mathematics and Computer Science, University of Latvia, 29 Raina Blvd., Riga, Latvia
Inguna Skadiņa

Corresponding author

Correspondence to Matīss Rikters.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Rikters, M., Skadiņa, I. (2018). Combining Machine Translated Sentence Chunks from Multiple MT Systems. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_3

DOI: https://doi.org/10.1007/978-3-319-75487-1_3
Published: 21 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1
eBook Packages: Computer Science, Computer Science (R0)