Language-related issues for NMT and PBMT for English–German and English–Serbian

Popović, Maja

doi:10.1007/s10590-018-9219-5

Published: 24 April 2018

Volume 32, pages 237–253 (2018)
Machine Translation

Abstract

This work presents an extensive comparison of language-related problems for neural machine translation (NMT) and phrase-based machine translation (PBMT) for German-to-English, English-to-German and English-to-Serbian. The explored issues relate both to the characteristics of the languages and to the (machine) translation process and, although related to typical translation error classes, go beyond them. The main advantages of the NMT approach are shown to be better generation of verb forms, fewer verb omissions, and better handling of English noun collocations and negation. The main obstacles for the NMT system are shown to be prepositions, the translation of ambiguous English (source) words, and the generation of English (target) continuous and perfect tenses. In addition, preliminary experiments show that a number of issues are complementary, i.e., they do not occur in the same segments and/or in the same form. This suggests that a combination or hybridisation of the NMT and PBMT approaches is a promising direction for improving both types of systems.

Notes

http://www.statmt.org/wmt16/.
http://www.statmt.org/wmt16/translation-task.html.
http://server1.nlp.insight-centre.org/asistent/.
We acknowledge that the main deficiency of the described approach is poor scalability, since the evaluation procedure is time-consuming and also resource-intensive. Furthermore, the annotators have to be familiar with both linguistic phenomena and the translation process, and to be fluent in both the source and the target language.
https://github.com/m-popovic/german-english_pbmt-nmt-issues.

Author information

Authors and Affiliations

Humboldt University of Berlin, Unter den Linden 6, 10099, Berlin, Germany
Maja Popović

Corresponding author

Correspondence to Maja Popović.

About this article

Cite this article

Popović, M. Language-related issues for NMT and PBMT for English–German and English–Serbian. Machine Translation 32, 237–253 (2018). https://doi.org/10.1007/s10590-018-9219-5

Received: 31 August 2017
Accepted: 22 March 2018
Published: 24 April 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s10590-018-9219-5
