Abstract
Many studies have shown that the application of controlled languages (CL) is an effective pre-editing technique for improving machine translation (MT) output. In this paper, we investigate whether this also holds true for neural machine translation (NMT). We compare the impact of applying nine CL rules on the quality of NMT output with their impact on rule-based, statistical, and hybrid MT, using three methods: error annotation, human evaluation, and automatic evaluation. The analyzed data is a German corpus-based test suite of technical texts translated into English by five MT systems (one neural, one rule-based, one statistical, and two hybrid MT systems). The comparison is conducted in terms of several quantitative parameters (number of errors, error types, quality ratings, and automatic evaluation metric scores). The results show that the CL rules positively affect the rule-based, statistical, and hybrid MT systems. However, CL does not improve the results of the NMT system. The output of the neural system is largely error-free both before and after CL application and has the highest quality among the analyzed MT systems in both scenarios, yet it shows a decrease in quality after the CL rules are applied. The qualitative discussion of the NMT output sheds light on the problems that CL causes for this kind of MT architecture.
Notes
Many excellent tutorials explain NMT in detail, for instance: http://nlp.stanford.edu/projects/nmt/Luong-Cho-Manning-NMT-ACL2016-v4.pdf.
This paper presents some of the results of a PhD thesis by the first author that examines the impact of the application of individual CL rules on the quality of machine translation output at different levels. This work is due to be published by the end of 2019.
As explained in Sect. 3, the CL position is the part of the source sentence that has to be modified in order to apply the CL rule and its equivalence in the target sentence.
The main criterion for determining the terms that should be replaced was whether the term existed in one of the most widely-used online dictionaries (www.dict.cc or www.leo.org). Terms that are not listed in these online dictionaries were replaced by common terms.
Since hybrid systems are structured differently, the study uses two hybrid systems: Bing is an SMT system with language-specific rule components, while Systran was originally an RBMT system and has been further developed into a hybrid system in recent years. Accordingly, the two yielded different output.
216 source sentences * 2 versions * 5 MT systems = 2,160 MT sentences.
The annotation groups are: FF (False-False): the translation contains an error both before and after CL; FR (False-Right): the translation contains an error only before CL; RF (Right-False): the translation contains an error only after CL; RR (Right-Right): no errors either before or after CL. For more details, see Sect. 3.1.
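The four-way grouping described above can be sketched as a small classifier. This is an illustrative sketch, not the annotation tooling used in the study; the boolean parameters are assumed names for the before-CL and after-CL error annotations of a sentence.

```python
def annotation_group(before_has_error: bool, after_has_error: bool) -> str:
    """Map a pair of error annotations (before/after CL) to its group label.

    Illustrative sketch of the FF/FR/RF/RR scheme; parameter names are
    assumptions, not the study's actual annotation interface.
    """
    if before_has_error and after_has_error:
        return "FF"  # False-False: error both before and after CL
    if before_has_error:
        return "FR"  # False-Right: error only before CL
    if after_has_error:
        return "RF"  # Right-False: error only after CL
    return "RR"      # Right-Right: no errors in either version
```

For example, a sentence whose translation was wrong before CL application but correct afterwards falls into group FR.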
TERbase = Translation Edit Rate. hLEPOR = Harmonic mean of enhanced Length Penalty, Precision, n-gram Position difference Penalty and Recall (Han et al. 2013).
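As a rough illustration of how an edit-rate metric such as TER scores a hypothesis, the sketch below divides the word-level edit distance by the reference length. It is a simplification and an assumption for illustration only: full TER (Snover et al. 2006) additionally allows block shifts, and the study used the toolkit's TERbase implementation, not this code.

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance (insert/delete/substitute) over reference length.

    A simplified stand-in for TER: the real metric also permits block shifts.
    """
    hyp, ref = hypothesis.split(), reference.split()
    # Standard Levenshtein dynamic programme over word sequences.
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(hyp)][len(ref)] / len(ref)
```

A hypothesis identical to its reference scores 0.0; lower is better, and scores above 1.0 are possible when the hypothesis needs more edits than the reference has words.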
Since the participants edited the whole MT output and not only the CL position, the mean values of the AEM scores include all necessary edits both within and outside the CL position. Therefore, only the differences in the mean values (mean of the metric score after CL minus before CL), not the mean values themselves, are taken into account.
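The comparison of mean differences described above amounts to a simple computation, sketched here on invented scores (the numbers are hypothetical, not data from the study):

```python
def mean(xs):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(xs) / len(xs)

# Hypothetical metric scores for the same sentences before and after CL.
scores_before_cl = [0.42, 0.55, 0.48]
scores_after_cl = [0.40, 0.50, 0.45]

# Only this difference of means (after CL minus before CL) is analyzed,
# not the mean values themselves.
delta = mean(scores_after_cl) - mean(scores_before_cl)
```

For an error-rate metric such as TER, a negative delta indicates improvement after CL; for a similarity metric such as hLEPOR, a positive delta does.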
The overall quality is the mean of the quality of style and quality of content, as analyzing the correlation here requires no distinction between the quality parameters.
References
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. In: ICLR. http://arxiv.org/pdf/1409.0473v7.pdf
Bouillon P, Gaspar L, Gerlach J, Porro V, Roturier J (2014) Pre-editing by forum users: a case study. In: proceedings of the Workshop on Controlled Natural Language (CNL) Simplifying Language Use (CNL), pp 3–10
Busemann S, Bojar O, Callison-Burch C, Cettolo M, Federico M, Garabik R, van Genabith J et al (2012) EuroMatrixPlus—Final Report. http://www.euromatrixplus.org/resources/86
Castilho S, Moorkens J, Gaspari F, Sennrich R, Sosoni V, Georgakopoulou P, Lohar P, Way A, Valerio Miceli Barone A, Gialama M (2017) A comparative quality evaluation of PBSMT and NMT using professional translators. In: proceedings of MT Summit 2017
Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: proceedings of EMNLP. https://arxiv.org/abs/1406.1078
Eisele A (2007) Hybrid machine translation: combining rule-based and statistical MT systems. Presented at the first machine translation marathon, Edinburgh. http://mt-archive.info/MTMarathon-2007-Eisele.pdf
Ferlein J, Hartge N (2008) Technische Dokumentation für internationale Märkte: haftungsrechtliche Grundlagen—Sprache—Gestaltung—Redaktion und Übersetzung. Expert Verl, Renningen
Fiederer R, O’Brien S (2009) Quality and machine translation: a realistic objective? J Special Transl 11:52–74
Geldbach S (2009) Neue Werkzeuge zur Autorenunterstützung—Quelltextbearbeitung in Kombination mit Translation-Memory-Systemen. In: Gerzymisch-Arbogast H, Villiger C (eds) MDÜ—Fachzeitschrift für Dolmetscher und Übersetzer, Heft 4/2009, pp 10–19
Gesellschaft für Technische Kommunikation—tekom e. V. (2013) Leitlinie „Regelbasiertes Schreiben, Deutsch für die Technische Kommunikation“. 2. Erweiterte Auflage. Stuttgart
Gonzàlez M, Giménez J (2014) An Open Toolkit for Automatic Machine Translation (Meta-) Evaluation. Technical Manual v3.0. February 2014. Technical Report LSI-14-2-T. Departamento de Lenguajes y Sistemas Informáticos, Universitat Politècnica de Catalunya. http://asiya.lsi.upc.edu/Asiya_technical_manual_v3.0.pdf
Han ALF, Wong DF, Chao LS, He L, Lu Y, Xing J, Zeng X (2013) Language-independent Model for machine translation evaluation with reinforced factors. In: proceedings of the Machine Translation Summit XIV (MT SUMMIT 2013), Nice, France: International Association for Machine Translation. pp. 215–222. http://www.mt-archive.info/10/MTS-2013-Han.pdf#!
Hutchins J, Somers HL (1992) An Introduction to machine translation. Academic Press Limited, Cambridge
Jääskeläinen R (1993) Investigating translation strategies. In: Tirkkonen-Condit S, Laffling J (eds) Recent trends in empirical translation research. Kielitieteellisiä tutkimuksia/Studies in Languages 28. University of Joensuu, Faculty of Arts, Joensuu, pp 99–120
Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas FB, Wattenberg M, Corrado G, Hughes M, Dean J (2016) Google’s multilingual neural machine translation system: enabling zero-shot translation. CoRR abs/1611.04558. http://arxiv.org/abs/1611.04558
Kalchbrenner N, Blunsom P (2013) Recurrent continuous translation models. In: proceedings of the 2013 conference on empirical methods in natural language processing. Association for Computational Linguistics, Seattle, Washington, USA, pp. 1700–1709. http://www.aclweb.org/anthology/D13-1176
Kamprath C, Adolphson E, Mitamura T, Nyberg E (1998) Controlled language for multilingual document production: experience with Caterpillar Technical English. In: Mitamura T et al (eds) Proceedings of the second international workshop on controlled language applications—CLAW '98. Language Technologies Institute, Carnegie Mellon University, Pittsburgh, pp 51–61
Koehn P (2010) Statistical machine translation. Cambridge University Press, Cambridge, New York
Koehn P (2017) Neural machine translation. Draft of chapter 13 of Statistical machine translation. Johns Hopkins University, Baltimore. http://arxiv.org/abs/1709.07809v1
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: proceedings of the ACL-2007 Demo and Poster Sessions, Prague, Czech Republic, pp 177–180
Lehmann S, Gottesman B, Grabowski R, Kudo M, Lo S K P, Siegel M, Fouvry F (2012) Applying CNL authoring support to improve machine translation of forum data. In: proceedings of third international workshop on controlled natural language (CNL), p 1–10
Lehrndorfer A (1996) Kontrollierte Sprache für Technische Dokumentation—Ein Ansatz für das Deutsche. In: Wissenschaftliche Grundlagen der technischen Kommunikation, edited by Hans P. Krings. Tübingen: G. Narr
Luong MT, Pham H, Manning CD (2015) Effective Approaches to attention-based neural machine translation. In: proceedings of EMNLP. http://aclweb.org/anthology/D15-1166
Ñeco RP, Forcada ML (1997) Asynchronous translations with recurrent neural nets. Neural Netw 4:2535–2540
Nitzke J (2019) Problem solving activities in post-editing and translation from scratch: a multi-method study. Language Science Press, Berlin. http://langsci-press.org/catalog/book/196
Nyberg E, Mitamura T (1996) Controlled language and knowledge-based machine translation: principles and practice. In: proceedings of the first controlled language application workshop (CLAW 1996), Leuven, Belgium, Centre for Computational Linguistics, pp 74–83
O’Brien S (2011) Towards predicting post-editing productivity. Mach Transl 25(3):197–215
Pouget-Abadie J, Bahdanau D, van Merriënboer B, Cho K, Bengio Y (2014) Overcoming the curse of sentence length for neural machine translation using automatic segmentation. In: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. http://www.aclweb.org/anthology/W14-4009
Reuther U (2003) Two in one—Can it work? Readability and translatability by means of controlled language. In: proceedings of the Joint Conference combining the 8th International Workshop of the European Association for Machine Translation and the 4th Controlled Language Applications Workshop (CLAW 2003), 15–17th May, Dublin City University, Ireland. pp 124–132
Rösener C (2010) Computational linguistics in the translator's workflow—combining authoring tools and translation memory systems. In: Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids, Los Angeles, California, pp 1–6
Snover M, Dorr BJ, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: proceedings of AMTA
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: NIPS, p 9. http://arxiv.org/abs/1409.3215
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L, Polosukhin I (2017) Attention is all you need. CoRR abs/1706.03762. http://arxiv.org/abs/1706.03762
Vilar D, Xu J, D’Haro LF, Ney H (2006) Error analysis of machine translation output. In: LREC-2006: fifth international conference on language resources and evaluation. Proceedings, Genoa, Italy, pp 697–702
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. CoRR abs/1609.08144. http://arxiv.org/abs/1609.08144.pdf
Shi X, Knight K, Yuret D (2016) Why neural translations are the right length. In: proceedings of the 2016 conference on empirical methods in natural language processing, Austin, Texas, pp 2278–2282. http://www.aclweb.org/anthology/D16-1248.pdf
Marzouk, S., Hansen-Schirra, S. Evaluation of the impact of controlled language on neural machine translation compared to other MT architectures. Machine Translation 33, 179–203 (2019). https://doi.org/10.1007/s10590-019-09233-w