Skip to main content
Log in

Post-editing neural machine translation versus phrase-based machine translation for English–Chinese

  • Published:
Machine Translation

Abstract

This paper aims to shed light on the post-editing process of the recently-introduced neural machine translation (NMT) paradigm. Using simple and more complex texts, we first evaluate the output quality from English to Chinese phrase-based statistical (PBSMT) and NMT systems. Nine raters assess the MT quality in terms of fluency and accuracy and find that NMT produces higher-rated translations than PBSMT for both texts. Then we analyze the effort expended by 68 student translators during HT and when post-editing NMT and PBSMT output. Our measures of post-editing effort are all positively correlated for both NMT and PBSMT post-editing. Our findings suggest that although post-editing output from NMT is not always significantly faster than post-editing PBSMT, it significantly reduces the technical and cognitive effort. We also find that, in contrast to HT, post-editing effort is not necessarily correlated with source text complexity.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

Notes

  1. https://taus.net/academy/best-practices/evaluate-best-practices/adequacy-fluency-guidelines (accessed May 2018).

  2. The Test for English Majors Band 8 is a national English test for English majors in China, which requires a candidate to master about 13,000 words.

References

  • Alves F (2006) A relevance-theoretic approach to effort and effect in translation: discussing the cognitive interface between inferential processing, problem-solving and decision-making. In: Proceedings of the international symposium on new horizons in theoretical translation studies, Hong Kong, China, pp 1–12

  • Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of the 6th international conference on learning representations, San Diego, California, USA

  • Balling LW (2008) A brief introduction to regression designs and mixed-effects modelling by a recent convert. In: Göpferich S, Jakobsen AL, Mees IM (eds) Looking at eyes: eye tracking studies of reading and translation processing. Copenhagen studies in language, vol 36, pp 175–192

  • Balling L, Carl M (2014) Production time across language and tasks: a large-scale analysis using the CRITT translation process database. In: Schwieter J, Ferreira A (eds) The development of translation competence: theories and methodologies from psycholinguistics and cognitive science. Cambridge Scholar Publishing, Cambridge, pp 239–268

    Google Scholar 

  • Bates D, Maechler M, Bolker B, Walker S (2014) lme4: linear mixed-effects models using Eigen and S4. R package version 3.1.2. http://CRAN.R-project.org/package=lme4

  • Bentivogli L, Bisazza A, Cettolo M, Federico M (2016) Neural versus phrase-based machine translation quality: a case study. In: Proceedings of the 2016 conference on empirical methods in natural language processing, Austin, Texas, pp 257–267

  • Bojar O, Chatterjee R, Federmann C et al (2016) Findings of the 2016 conference on machine translation. In: Proceedings of the first conference on machine translation (WMT 2016), Berlin, Germany, pp 131–198

  • Butterworth B (1980) Evidence from pauses in speech. In: Butterworth B (ed) Language production, vol 1. Speech and talk. Academic Press, London, pp 155–176

    Google Scholar 

  • Carl M (2012) Translog-II: a program for recording user activity data for empirical reading and writing research. In: Proceedings of the eighth international conference on language resources and evaluation (LREC12), pp 4108–4112

  • Carl M, Dragsted B, Elming J, Hardt D, Jakobsen AL (2011) The process of post-editing: a pilot study. In: Sharp B, Zock M, Carl M, Jakobsen AL (eds) Proceedings of the 8th international NLPCS Workshop, Copenhagen, Denmark, pp 131–142

  • Carl M, Gutermuth S, Hansen-Schirra S (2015) Post-editing machine translation: a usability test for professional translation settings. In: Schwieter J, Ferreira A (eds) Psycholinguistic and cognitive inquiries in translation and interpretation studies. John Benjamins, Amsterdam, pp 145–174

    Google Scholar 

  • Carl M, Schaeffer M, Bangalore S (2016) The CRITT Translation process research database. In: Carl M, Bangalore S, Schaeffer M (eds) New directions in empirical translation process research: exploring the CRITT TPR-DB. Springer, Cham, pp 13–54

    Chapter  Google Scholar 

  • Castilho S, Moorkens J, Gaspari F, Sennrich R, Way A, Georgakopoulou P (2018) Evaluating MT for massive open online courses. Mach Transl 32(3):255–278

    Article  Google Scholar 

  • da Silva IAL, Alves F, Schmaltz M et al (2017) Translation, post-editing and directionality: a study of effort in the Chinese-Portuguese language pair. In: Jakobsen AL, Mesa-Lao B (eds) Translation in transition. John Benjamins, Amsterdam, pp 91–117

    Google Scholar 

  • Daems J, Vandepitte S, Hartsuiker RJ, Macken L (2017) Translation methods and experience: a comparative analysis of human translation and post-editing with students and professional translators. Meta 62(2):245–270

    Article  Google Scholar 

  • Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378–382

    Article  Google Scholar 

  • Germann U (2008) Yawat: yet another word alignment tool. In: Proceedings of the 46th annual meeting of the association for computational linguistics: demo session, Columbus. Columbus, OH, pp 20–23

  • Groves D, Schmidtke D (2009) Identification and analysis of post-editing patterns for MT. In: Proceedings of the twelfth machine translation Summit, August 26–30, Ottawa, ON, Canada, pp 429–436

  • Guerberof A (2009) Productivity and quality in MT post-editing. In: Proceedings of MT Summit XII—workshop: beyond translation memories: new tools for translators MT, Ottawa, Ontario, Canada

  • Hansen G (2002) Zeit und Qualität im Übersetzungsprozess. In: Hansen G (ed) Empirical translation studies: process and product. Samfundslitteratur, Copenhagen, pp 29–54

    Google Scholar 

  • Jakobsen AL (1998) Logging time delay in translation, LSP texts and the translation process. In: Copenhagen working papers, pp 73–101

  • Jakobsen AL (2002) Translation drafting by professional translators and by translation students. In: Empirical translation studies: process and product. Copenhagen studies in language vol 27, pp 191–204

  • Jensen KTH (2009) Indicators of text complexity. In: Göpferich S, Jakobsen AL, Mees IM (eds) Behind the mind: methods, models and results in translation process research. Copenhagen studies in language, vol 36. Samfundslitteratur, Copenhagen, pp 61–80

    Google Scholar 

  • Jensen KTH (2011) Allocation of cognitive resources in translation—an eye-tracking and key-logging study. Dissertation, Copenhagen Business School

  • Jia Y, Carl M, Wang X (2019) How does the post-editing of neural machine translation compare with from-scratch translation? A product and process study. J Specialised Transl 3:60–86

    Google Scholar 

  • Junczys-Dowmunt M, Dwojak T, Hoang H (2016) Is neural machine translation ready for deployment? A case study on 30 translation directions. In: Proceedings of the 9th international workshop on spoken language translation, Seattle, Washington

  • Klubička F, Toral A, Sánchez-Cartagena VM (2017) Fine-grained human evaluation of neural versus phrase-based machine translation. Prague Bull Math Linguist 108:121–132

    Article  Google Scholar 

  • Koglin A (2015) An empirical investigation of cognitive effort required to post-edit machine translated metaphors compared to the translation of metaphors. Transl Interpret 7(1):126–141

    Google Scholar 

  • Krings HP (2001) Repairing texts: empirical investigations of machine translation post-editing processes. The Kent State University Press, Kent, OH

    Google Scholar 

  • Lacruz I, Shreve GM (2014) Pauses and cognitive effort in post-editing. In: O’Brien S, Winther Balling L, Carl M, Simard M, Specia L (eds) Post-editing of machine translation: processes and applications. Cambridge Scholars Publishing, Newcastle-upon-Tyne, pp 246–272

    Google Scholar 

  • Lacruz I, Shreve GM, Angelone E (2012) Average pause ratio as an indicator of cognitive effort in post-editing: a case study. In: Proceedings of the AMTA 2012 workshop on post-editing technology and practice (WPTP 2012), San Diego, CA, pp 21–30

  • Lacruz I, Denkowski M, Lavie A (2014) Cognitive demand and cognitive effort in post-editing. In: Proceedings of the third workshop on post-editing technology and practice, Vancouver, Canada

  • Mesa-Lao B (2014) Gaze behaviour on source texts: an exploratory study comparing translation and post-editing. In: O’Brien S, Winther Balling L, Carl M, Simard M, Specia L (eds) Post-editing of machine. Cambridge Scholars Publishing, Newcastle, pp 219–245

    Google Scholar 

  • Mishra A, Bhattacharyya P, Carl M (2013) Automatically predicting sentence translation difficulty. Proceedings from the 51st annual meeting of the association for computational linguistics (vol 2: short papers). Sofia, Bulgaria, pp 346–351

    Google Scholar 

  • Moorkens J, O’Brien S (2015) Post-editing evaluations: trade-offs between novice and professional participants. In: Proceedings of the 18th annual conference of the European Association for Machine Translation (EAMT 2015), Antalya, Turkey, pp 75–81

  • O’Brien S (2006) Pauses as indicators of cognitive effort in post-editing machine translation. Across Lang Cult 7(1):1–21

    Article  Google Scholar 

  • O’Brien S (2011) Towards predicting post-editing productivity. Mach Transl 25(3):197–215

    Article  Google Scholar 

  • O’Brien S (2007) An empirical investigation of temporal and technical post-editing effort. Transl Interpret Stud 2(1):83–136

    Article  Google Scholar 

  • Plitt M, Masselot F (2010) A productivity test of statistical machine translation post-editing in a typical localisation context. Prague Bull Math Linguist 93:7–16

    Article  Google Scholar 

  • R Core Team (2014) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org

  • Schaeffer M, Carl M, Lacruz I, Aizawa A (2016) Measuring cognitive translation effort with activity units. Baltic J Mod Comput 4(2):331–345

    Google Scholar 

  • Schilperoord J (1996) It’s about time. Temporal aspects of cognitive processes in text production. Rodopi, Amsterdam

    Google Scholar 

  • Screen B (2017) Machine translation and Welsh: analysing free statistical machine translation for the professional translation of an under-researched language pair. J Spec Transl 28:317–344

    Google Scholar 

  • Shterionov D, Superbo R, Nagle P, Casanellas L, O’Dowd T (2018) Human versus automatic quality evaluation of NMT and PBSMT. Mach Transl 32(3):217–235

    Article  Google Scholar 

  • Snover M, Dorr B, Schwartz R, Miccuilla L, Makhoul J (2006) A study of translation edit rate with targeted human annotations. In: Proceedings of association for machine translation in the Americas, Cambridge, MA, USA, pp 223–231

  • Suo J, Yu B, He Y, Zang G (2012) Study of ambiguities of English–Chinese machine translation. Appl Mech Mater 157:472–475

    Article  Google Scholar 

  • Tatsumi M (2009) Correlation between automatic evaluation scores, post-editing speed and some other factors. In: Proceedings of MT Summit XII, Ottawa, Canada, pp 332–339

  • TAUS (2016) TAUS post-editing guidelines. https://www.taus.net/think-tank/articles/postedit-articles/taus-post-editing-guidelines. Accessed 2 May 2016

  • Toral A, Sánchez-Cartagena VM (2017) A multifaceted evaluation of neural versus phrase-based machine translation for 9 language directions. arXiv:1701.02901

  • Wu X, Cardey S, Greenfield P (2006) Realization of the Chinese BA-construction in an English-Chinese machine translation system. In: Proceedings of the fifth SIGHAN workshop on Chinese language processing, pp 79–86

  • Wu Y, Schuster M, Chen Z et al (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. CoRR. arXiv:1609.08144

  • Yamada M (2015) Can college students be post-editors? An investigation into employing language learners in machine translation plus post-editing settings. Mach Transl 29(1):49–67

    Article  Google Scholar 

Download references

Acknowledgements

Thanks goes to our participants, annotators and raters for their precious time. Particular gratitude is extended to Prof. Yves Gambier for his comments on the earlier drafts of this article. We are also grateful to Dr. Joss Moorkens and the anonymous reviewers for their constructive comments. This research was supported by the Social Science Foundation of Hunan Province (17ZDB005), China Hunan Provincial Science & Technology Foundation ([2017]131).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xiangling Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Jia, Y., Carl, M. & Wang, X. Post-editing neural machine translation versus phrase-based machine translation for English–Chinese. Machine Translation 33, 9–29 (2019). https://doi.org/10.1007/s10590-019-09229-6

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10590-019-09229-6

Keywords

Navigation