Evaluation of the impact of controlled language on neural machine translation compared to other MT architectures

Machine Translation

Abstract

Many studies have shown that the application of controlled languages (CL) is an effective pre-editing technique to improve machine translation (MT) output. In this paper, we investigate whether this also holds true for neural machine translation (NMT). We compare the impact of applying nine CL rules on the quality of NMT output with their impact on rule-based, statistical, and hybrid MT, using three methods: error annotation, human evaluation, and automatic evaluation. The analyzed data is a German corpus-based test suite of technical texts that have been translated into English by five MT systems (a neural, a rule-based, a statistical, and two hybrid MT systems). The comparison is conducted in terms of several quantitative parameters (number of errors, error types, quality ratings, and automatic evaluation metric scores). The results show that CL rules positively affect rule-based, statistical, and hybrid MT systems. However, CL does not improve the results of the NMT system. The output of the neural system is mostly error-free both before and after CL application and has the highest quality among the analyzed MT systems in both scenarios, yet it shows a decrease in quality after the CL rules are applied. The qualitative discussion of the NMT output sheds light on the problems that CL causes for this kind of MT architecture.

Notes

  1. Many excellent tutorials explain NMT in detail, for instance: http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf.

  2. This paper presents some of the results of a PhD thesis by the first author that examines the impact of the application of individual CL rules on the quality of machine translation output at different levels. This work is due to be published by the end of 2019.

  3. https://www.tekom.de.

  4. As explained in Sect. 3, the CL position is the part of the source sentence that has to be modified in order to apply the CL rule, together with its equivalent in the target sentence.

  5. http://www.iai-sb.de/de/produkte/clat.

  6. The main criterion for determining which terms should be replaced was whether the term was listed in one of the most widely used online dictionaries (www.dict.cc or www.leo.org). Terms not listed in these dictionaries were replaced by common terms.

  7. https://www.bing.com/translator/.

  8. https://translate.google.de/.

  9. http://www.lucysoftware.com/english/machine-translation/lucy-lt-kwik-translator-/.

  10. https://www.freetranslation.com/de/.

  11. http://www.systranet.com/translate.

  12. Since hybrid systems are structured differently, the study uses two hybrid systems: Bing is an SMT system with language-specific rule components, while Systran was originally an RBMT system and has been further developed into a hybrid system in recent years. Accordingly, the two systems yielded different output.

  13. 216 source sentences * 2 versions * 5 MT systems = 2,160 MT sentences.

  14. As explained in Sect. 3, the CL position is the part of the source sentence that has to be modified in order to apply the CL rule, together with its equivalent in the target sentence.

  15. The annotation groups are: FF (for False-False): the translation contains an error both before and after CL; FR (for False-Right): the translation contains an error only before CL; RF (for Right-False): the translation contains an error only after CL; RR (for Right-Right): the translation contains no error either before or after CL. For more details, see Sect. 3.1. A minimal illustrative sketch of this grouping (together with the score computations in notes 17 and 18) is given after these notes.

  16. TERbase = Translation Edit Rate (Snover et al. 2006). hLEPOR = Harmonic mean of enhanced Length Penalty, Precision, n-gram Position difference Penalty and Recall (Han et al. 2013). Schematic formulas for both metrics are sketched after these notes.

  17. Since the participants edited the whole MT output and not only the CL position, the mean values of the AEM scores include all necessary edits both within and outside the CL position. Therefore, only the differences in the mean values (mean of the metric score after CL minus before CL), not the mean values themselves, are taken into account.

  18. The overall quality is the mean of the quality of style and quality of content, as analyzing the correlation here requires no distinction between the quality parameters.
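
The following minimal Python sketch illustrates the annotation groups defined in note 15 and the handling of the scores described in notes 17 and 18. It is an illustration only, assuming hypothetical variable names, example scores, and ratings; it does not reproduce the study's actual data or tooling.

# Illustrative sketch only; all names and values below are hypothetical.
from statistics import mean

def annotation_group(error_before_cl: bool, error_after_cl: bool) -> str:
    """Map the presence of an error before/after CL application to the
    annotation groups of note 15 (FF, FR, RF, RR)."""
    return {
        (True, True): "FF",    # error both before and after CL
        (True, False): "FR",   # error only before CL
        (False, True): "RF",   # error only after CL
        (False, False): "RR",  # no error in either version
    }[(error_before_cl, error_after_cl)]

# Hypothetical per-sentence metric scores (e.g. hLEPOR) before and after CL.
scores_before = [0.61, 0.72, 0.58]
scores_after = [0.66, 0.74, 0.60]

# Note 17: only the difference of the mean scores (after minus before) is
# interpreted, since the edits also cover material outside the CL position.
mean_difference = mean(scores_after) - mean(scores_before)

# Note 18: overall quality as the mean of the style and content ratings.
quality_style, quality_content = 3.4, 3.8  # hypothetical ratings
overall_quality = mean([quality_style, quality_content])

print(annotation_group(True, False))             # -> FR
print(round(mean_difference, 3), overall_quality)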
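
For orientation, the two metrics in note 16 can be summarized schematically as follows; this is a simplified sketch following the general definitions in Snover et al. (2006) and Han et al. (2013), not the exact parameterization used in the study:

\mathrm{TER} = \frac{\#\,\text{edits}}{\text{average number of reference words}}
\qquad
\mathrm{hLEPOR} = \frac{\sum_i w_i}{\sum_i w_i / f_i}

where the factors f_i are the enhanced length penalty, precision, the n-gram position difference penalty, and recall, and the w_i are their weights.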

References

  • Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. In: ICLR. http://arxiv.org/pdf/1409.0473v7.pdf

  • Bouillon P, Gaspar L, Gerlach J, Porro V, Roturier J (2014) Pre-editing by forum users: a case study. In: proceedings of the Workshop on Controlled Natural Language (CNL) Simplifying Language Use, pp 3–10

  • Busemann S, Bojar O, Callison-Burch C, Cettolo M, Federico M, Garabik R, van Genabith J et al (2012) EuroMatrixPlus—Final Report. http://www.euromatrixplus.org/resources/86

  • Castilho S, Moorkens J, Gaspari F, Sennrich R, Sosoni V, Georgakopoulou P, Lohar P, Way A, Valerio Miceli Barone A, Gialama M (2017) A comparative quality evaluation of PBSMT and NMT using professional translators. In: proceedings of MT Summit 2017

  • Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. In: proceedings of EMNLP. https://arxiv.org/abs/1406.1078

  • Eisele A (2007) Hybrid machine translation: combining rule-based and statistical MT systems. Presented at the first machine translation marathon, Edinburgh. http://mt-archive.info/MTMarathon-2007-Eisele.pdf

  • Ferlein J, Hartge N (2008) Technische Dokumentation für internationale Märkte: haftungsrechtliche Grundlagen—Sprache—Gestaltung—Redaktion und Übersetzung. Expert Verl, Renningen

  • Fiederer R, O’Brien S (2009) Quality and machine translation: a realistic objective? J Special Transl 11:52–74

  • Geldbach S (2009) Neue Werkzeuge zur Autorenunterstützung—Quelltextbearbeitung in Kombination mit Translation-Memory-Systemen. In: Gerzymisch-Arbogast H, Villiger C (eds) MDÜ – Fachzeitschrift für Dolmetscher und Übersetzer, Heft 4/2009, pp 10–19

  • Gesellschaft für Technische Kommunikation—tekom e. V. (2013) Leitlinie „Regelbasiertes Schreiben, Deutsch für die Technische Kommunikation“. 2. Erweiterte Auflage. Stuttgart

  • Gonzàlez M, Giménez J (2014) An Open Toolkit for Automatic Machine Translation (Meta-) Evaluation. Technical Manual v3.0. February 2014. Technical Report LSI-14-2-T. Departamento de Lenguajes y Sistemas Informáticos, Universitat Politècnica de Catalunya. http://asiya.lsi.upc.edu/Asiya_technical_manual_v3.0.pdf

  • Han ALF, Wong DF, Chao LS, He L, Lu Y, Xing J, Zeng X (2013) Language-independent Model for machine translation evaluation with reinforced factors. In: proceedings of the Machine Translation Summit XIV (MT SUMMIT 2013), Nice, France: International Association for Machine Translation. pp. 215–222. http://www.mt-archive.info/10/MTS-2013-Han.pdf#!

  • Hutchins J, Somers HL (1992) An Introduction to machine translation. Academic Press Limited, Cambridge

  • Jääskeläinen R (1993) Investigating translation strategies. In: Tirkkonen-Condit S, Laffling J (eds) Recent trends in empirical translation research. Kielitieteellisiä tutkimuksia/Studies in Languages 28. University of Joensuu, Faculty of Arts, Joensuu, pp 99–120

  • Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas FB, Wattenberg M, Corrado G, Hughes M, Dean J (2016) Google’s multilingual neural machine translation system: enabling zero-shot translation. CoRR abs/1611.04558. http://arxiv.org/abs/1611.04558

  • Kalchbrenner N, Blunsom P (2013) Recurrent continuous translation models. In: proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, Washington, USA, pp. 1700–1709. http://www.aclweb.org/anthology/D13-1176

  • Kamprath C, Adolphson E, Mitamura T, Nyberg E (1998) Controlled language for multilingual document production: experience with Caterpillar Technical English. In: Mitamura et al. (eds) proceedings of the second international workshop on controlled language applications—CLAW ‘98, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, pp 51–61

  • Koehn P (2010) Statistical machine translation. Cambridge University Press, Cambridge, New York

  • Koehn P (2017) Neural machine translation. Chapter 13 of Statistical machine translation (draft). Johns Hopkins University, Baltimore. http://arxiv.org/abs/1709.07809v1

  • Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: proceedings of the ACL-2007 Demo and Poster Sessions, Prague, Czech Republic, pp 177–180

  • Lehmann S, Gottesman B, Grabowski R, Kudo M, Lo S K P, Siegel M, Fouvry F (2012) Applying CNL authoring support to improve machine translation of forum data. In: proceedings of third international workshop on controlled natural language (CNL), p 1–10

  • Lehrndorfer A (1996) Kontrollierte Sprache für Technische Dokumentation—Ein Ansatz für das Deutsche. In: Krings HP (ed) Wissenschaftliche Grundlagen der technischen Kommunikation. G. Narr, Tübingen

  • Luong MT, Pham H, Manning CD (2015) Effective Approaches to attention-based neural machine translation. In: proceedings of EMNLP. http://aclweb.org/anthology/D15-1166

  • Ñeco RP, Forcada ML (1997) Asynchronous translations with recurrent neural nets. In: proceedings of the International Conference on Neural Networks (ICNN’97), vol 4, pp 2535–2540

  • Nitzke J (2019) Problem solving activities in post-editing and translation from scratch: a multi-method study. Language Science Press, Berlin. http://langsci-press.org/catalog/book/196

  • Nyberg E, Mitamura T (1996) Controlled language and knowledge-based machine translation: principles and practice. In: proceedings of the first controlled language application workshop (CLAW 1996), Leuven, Belgium, Centre for Computational Linguistics, pp 74–83

  • O’Brien S (2011) Towards predicting post-editing productivity. Mach Transl 25(3):197–215

  • Pouget-Abadie J, Bahdanau D, van Merriënboer B, Cho K, Bengio Y (2014) Overcoming the curse of sentence length for neural machine translation using automatic segmentation. In: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. http://www.aclweb.org/anthology/W14-4009

  • Reuther U (2003) Two in one—Can it work? Readability and translatability by means of controlled language. In: proceedings of the Joint Conference combining the 8th International Workshop of the European Association for Machine Translation and the 4th Controlled Language Applications Workshop (CLAW 2003), 15–17th May, Dublin City University, Ireland. pp 124–132

  • Rösener C (2010) Computational linguistics in the translator’s workflow—combining authoring tools and translation memory systems. In: Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids. Los Angeles, California, pp 1–6

  • Snover M, Dorr BJ, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: proceedings of AMTA 2006

  • Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems (NIPS 2014). http://arxiv.org/abs/1409.3215

  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L, Polosukhin I (2017) Attention is all you need. CoRR abs/1706.03762. http://arxiv.org/abs/1706.03762

  • Vilar D, Xu J, D’Haro LF, Ney H (2006) Error analysis of machine translation output. In: LREC-2006: fifth international conference on language resources and evaluation. Proceedings, Genoa, Italy, pp 697–702

  • Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. CoRR abs/1609.08144. http://arxiv.org/abs/1609.08144.pdf

  • Shi X, Knight K, Yuret D (2016) Why neural translations are the right length. In: proceedings of the 2016 conference on empirical methods in natural language processing, Austin, Texas, pp 2278–2282. http://www.aclweb.org/anthology/D16-1248.pdf

Author information

Corresponding author

Correspondence to Shaimaa Marzouk.

About this article

Cite this article

Marzouk, S., Hansen-Schirra, S. Evaluation of the impact of controlled language on neural machine translation compared to other MT architectures. Machine Translation 33, 179–203 (2019). https://doi.org/10.1007/s10590-019-09233-w
