Abstract
This paper aims to shed light on the post-editing process for the recently introduced neural machine translation (NMT) paradigm. Using simple and more complex texts, we first evaluate the output quality of English-to-Chinese phrase-based statistical machine translation (PBSMT) and NMT systems. Nine raters assess the MT quality in terms of fluency and accuracy and find that NMT produces higher-rated translations than PBSMT for both texts. We then analyze the effort expended by 68 student translators during from-scratch human translation (HT) and when post-editing NMT and PBSMT output. Our measures of post-editing effort are all positively correlated for both NMT and PBSMT post-editing. Our findings suggest that although post-editing NMT output is not always significantly faster than post-editing PBSMT output, it significantly reduces technical and cognitive effort. We also find that, in contrast to HT, post-editing effort is not necessarily correlated with source text complexity.

Notes
The Test for English Majors Band 8 (TEM-8) is a national English test for English majors in China; it requires candidates to master about 13,000 words.
Acknowledgements
Thanks go to our participants, annotators and raters for their precious time. Particular gratitude is extended to Prof. Yves Gambier for his comments on earlier drafts of this article. We are also grateful to Dr. Joss Moorkens and the anonymous reviewers for their constructive comments. This research was supported by the Social Science Foundation of Hunan Province (17ZDB005) and the China Hunan Provincial Science & Technology Foundation ([2017]131).
Cite this article
Jia, Y., Carl, M. & Wang, X. Post-editing neural machine translation versus phrase-based machine translation for English–Chinese. Machine Translation 33, 9–29 (2019). https://doi.org/10.1007/s10590-019-09229-6
DOI: https://doi.org/10.1007/s10590-019-09229-6