
Involving language professionals in the evaluation of machine translation

  • Original Paper
  • Published in Language Resources and Evaluation

Abstract

Significant breakthroughs in machine translation (MT) only seem possible if human translators are brought into the loop. While automatic evaluation and scoring mechanisms such as BLEU have enabled the fast development of systems, it is not clear how systems can meet real-world (quality) requirements in industrial translation scenarios today. The taraXÜ project has paved the way for wide usage of multiple MT outputs through various feedback loops in system development. The project has integrated human translators into the development process, thus collecting feedback for possible improvements. This paper describes results from a detailed human evaluation in which the performance of different types of translation systems has been compared and analysed via ranking, error analysis and post-editing.
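
As a concrete illustration of the kind of automatic scoring the abstract contrasts with human evaluation, the following minimal Python sketch computes a simplified sentence-level BLEU score (clipped n-gram precision combined with a brevity penalty, after Papineni et al. 2001). It assumes a single whitespace-tokenised reference and omits the smoothing and corpus-level aggregation used in practice; it is an illustration, not the evaluation setup used in the project.

```python
# Minimal sentence-level BLEU sketch: clipped n-gram precision up to order 4
# plus a brevity penalty. Assumes one reference and whitespace tokenisation.
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams of order n in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        # Clip candidate counts by reference counts (modified precision).
        matches = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if matches == 0:
            return 0.0  # any zero precision drives the geometric mean to 0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(sum(log_precisions) / max_n)


print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
print(bleu("the cat sat on the rug", "the cat sat on the mat"))  # ~0.76
```

Note how a single substituted word lowers every n-gram precision at once; this sensitivity to surface overlap is one reason the paper complements such metrics with human ranking, error analysis and post-editing.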


Notes

  1. http://taraxu.dfki.de/

  2. http://translate.google.com/

  3. http://www.trados.com/en/

  4. For reasons of required anonymisation.

  5. Note that this setup must be seen as an experiment: it was designed to simulate the use of translation memories (TMs), although it does not mirror their exact use in the translation industry (see the illustrative sketch after these notes).

  6. More publications can be found online: http://taraxu.dfki.de/publications
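
Note 5 mentions simulating the use of translation memories. The paper's exact procedure is not reproduced here; purely as an illustration, the sketch below shows the fuzzy-matching idea behind TM lookup in CAT tools such as Trados (note 3): retrieve the stored translation whose source segment is most similar to the new segment, provided the similarity clears a threshold. The toy memory, the function name and the 0.7 threshold are all hypothetical choices for this sketch.

```python
# Hypothetical sketch of fuzzy TM lookup; not the taraXÜ project's code.
from difflib import SequenceMatcher

# A toy TM: source segments mapped to stored human translations.
tm = {
    "Press the start button.": "Drücken Sie die Starttaste.",
    "Close all open windows.": "Schließen Sie alle offenen Fenster.",
}


def best_tm_match(segment, memory, threshold=0.7):
    """Return (score, translation) for the most similar stored segment,
    or (score, None) if no match clears the fuzzy-match threshold."""
    best_score, best_target = 0.0, None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, target
    return (best_score, best_target) if best_score >= threshold else (best_score, None)


print(best_tm_match("Press the stop button.", tm))  # high fuzzy match
print(best_tm_match("Restart the computer.", tm))   # below threshold -> None
```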

References

  • Alonso, J. A., & Thurmair, G. (2003). The Comprendium translator system. In: Proceedings of the Ninth Machine Translation Summit.

  • Avramidis, E., Popović, M., Vilar, D., & Burchardt, A. (2011). Evaluate with confidence estimation: Machine ranking of translation outputs using grammatical features. In: Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, Scotland, pp. 65–70.

  • Burchardt, A., Tscherwinka, C., & Avramidis, E. (2013). Machine translation at work. In Studies in computational intelligence (Vol. 458, pp. 241–261). Berlin: Springer.


  • Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M., & Zaidan, O. (2010). Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation. In: Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR. Association for Computational Linguistics, Uppsala, Sweden, pp. 17–53, revised August 2010.

  • Eisele, A., & Chen, Y. (2010). MultiUN: A multilingual corpus from United Nation documents. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), La Valletta, Malta, pp. 2868–2872.

  • Farzindar, A., & Lapalme, G. (2009). Machine translation of legal information and its evaluation. In: Proceedings of the 22nd Canadian Conference on Artificial Intelligence (Canadian AI 09), Kelowna, BC, pp. 64–73.

  • Federmann, C. (2010). Appraise: An open-source toolkit for manual phrase-based evaluation of translations. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), La Valletta, Malta.

  • He, Y., Ma, Y., Roturier, J., Way, A., & van Genabith, J. (2010). Improving the post-editing experience using translation recommendation: A user study. In: Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA 2010), Denver, Colorado.

  • Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., et al. (2007). Moses: Open source toolkit for statistical machine translation. In: Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions. Association for Computational Linguistics, Stroudsburg, PA, USA, ACL ’07, pp. 177–180.

  • Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2001). Bleu: A method for automatic evaluation of machine translation. IBM Research Report RC22176 (W0109-022), IBM.

  • Popović, M. (2011). Hjerson: An open source tool for automatic error classification of machine translation output. The Prague Bulletin of Mathematical Linguistics, 96, 59–68.

  • Specia, L., & Farzindar, A. (2010). Estimating machine translation post-editing effort with HTER. In: Proceedings of AMTA-2010 Workshop Bringing MT to the User. MT Research and the Translation Industry, Denver, Colorado.

  • Tiedemann, J. (2009). News from OPUS—A collection of multilingual parallel corpora with tools and interfaces. In: N. Nicolov, K. Bontcheva, G. Angelova & R. Mitkov (Eds.), Advances in natural language processing (Vol. V, pp. 237–248). Borovets, Bulgaria.

  • Vilar, D., Xu, J., D’Haro, L. F., & Ney, H. (2006). Error analysis of machine translation output. In: Proceedings of the International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, pp. 697–702.

  • Vilar, D., Stein, D., Huck, M., & Ney, H. (2010). Jane: Open source hierarchical translation, extended with reordering and lexicon models. In: Proceedings of the ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, Uppsala, Sweden, pp. 262–270.


Acknowledgments

This work has been developed within the taraXÜ project, financed by TSB Technologiestiftung Berlin (Zukunftsfonds Berlin) and co-financed by the European Union (European Fund for Regional Development). Thanks to our colleague Christian Federmann for helping with the Appraise system.

Author information


Corresponding author

Correspondence to Maja Popović.



Cite this article

Popović, M., Avramidis, E., Burchardt, A. et al. Involving language professionals in the evaluation of machine translation. Lang Resources & Evaluation 48, 541–559 (2014). https://doi.org/10.1007/s10579-014-9286-z

