Combining Machine Translated Sentence Chunks from Multiple MT Systems

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2016)

Abstract

This paper presents a hybrid machine translation (HMT) system that uses syntactic analysis to split source sentences into phrases, translates the phrases using multiple online machine translation (MT) system application program interfaces (APIs), and generates output by combining the translated chunks to obtain the best possible translation. The aim of this study is to improve the translation quality of English-Latvian texts over each of the individual MT APIs. The best translation hypothesis is selected by calculating the perplexity of each hypothesis using an n-gram language model. The result is a phrase-based multi-system machine translation system that improves MT output compared to individual online MT systems. The proposed approach shows an improvement of up to +1.48 points in BLEU and −0.015 in TER scores compared to the baselines and related research.
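
The perplexity-based selection step described above can be illustrated with a minimal sketch. It assumes a toy bigram model with add-one smoothing trained on a few invented Latvian sentences, standing in for a real n-gram language model. The names `BigramLM`, `select_best`, and `combine_chunks` are hypothetical and are not taken from the paper or its implementation.

```python
import math
from collections import Counter


class BigramLM:
    """Toy bigram language model with add-one smoothing (an assumption,
    used here only as a stand-in for a real n-gram language model)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        # One extra slot reserves probability mass for unseen words.
        self.vocab_size = len(self.unigrams) + 1

    def perplexity(self, text):
        """Return exp of the negative mean log-probability per bigram."""
        tokens = ["<s>"] + text.split() + ["</s>"]
        log_prob = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            log_prob += math.log(numerator / denominator)
        return math.exp(-log_prob / (len(tokens) - 1))


def select_best(hypotheses, lm):
    """Pick the hypothesis with the lowest perplexity under the model."""
    return min(hypotheses, key=lm.perplexity)


def combine_chunks(chunk_hypotheses, lm):
    """For each chunk, keep the best-scoring translation and join them."""
    return " ".join(select_best(options, lm) for options in chunk_hypotheses)


# Invented training data and candidate translations, for illustration only.
lm = BigramLM(["es redzu māju", "es redzu suni", "viņš redz māju"])
candidates = [["redzu es", "es redzu"], ["mājai", "māju"]]
print(combine_chunks(candidates, lm))
```

Each inner list holds the translations of one chunk as returned by different MT systems. Lower perplexity means the model finds the word sequence more fluent, so the sketch keeps the lowest-scoring candidate per chunk.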

Notes

  1. Latvian public administration machine translation service - http://hugo.lv.

  2. Google Translate API - https://cloud.google.com/translate/.

  3. Bing Translator Control - http://www.bing.com/dev/en-us/translator.

  4. Yandex Translate API - https://tech.yandex.com/translate/.

  5. Latvian public administration machine translation service API - http://hugo.lv/TranslationAPI.

  6. ChunkMT - https://github.com/M4t1ss/ChunkMT.

Acknowledgements

The research was supported by Grant 271/2012 from the Latvian Council of Science.

Author information

Corresponding author

Correspondence to Matīss Rikters.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Rikters, M., Skadiņa, I. (2018). Combining Machine Translated Sentence Chunks from Multiple MT Systems. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_3

  • DOI: https://doi.org/10.1007/978-3-319-75487-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75486-4

  • Online ISBN: 978-3-319-75487-1

  • eBook Packages: Computer Science, Computer Science (R0)
