ABSTRACT
This research advocates for the integration of Human Evaluation (HE) into Machine Translation Quality Assessment (MTQA), countering the over-reliance on Automatic Evaluation Metrics (AEMs) and the risks that accompany it. Highlighting the limitations of AEMs and the strengths of HE, the project proposes a mixed-methods training approach for Natural Language Processing (NLP) students, built on the ADDIE instructional-design framework. The aim is to equip NLP students with robust HE methods for a more comprehensive MTQA process, so that future developers have the skills to ensure the reliability of MT systems and to prevent risks and biases from being propagated.
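To make the abstract's claim about AEM limitations concrete, here is a minimal, hypothetical sketch (not taken from the paper): a toy clipped n-gram precision score in the spirit of BLEU assigns a near-identical unigram score, and only a modestly lower bigram score, to a translation that reverses the source meaning. The function name and example sentences are invented for illustration.

```python
# Hypothetical sketch: a toy clipped n-gram precision in the spirit of
# BLEU, illustrating how surface-overlap metrics can reward fluent but
# inadequate output that a human evaluator would penalise.
from collections import Counter

def ngram_precision(hyp: list[str], ref: list[str], n: int) -> float:
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    if not hyp_ngrams:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs
    # in the reference (the "clipping" step).
    overlap = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return overlap / sum(hyp_ngrams.values())

ref = "the patient must not take this medicine".split()
good = "the patient must not take this drug".split()
bad = "the patient must now take this medicine".split()  # negation lost

for name, hyp in [("adequate", good), ("inadequate", bad)]:
    p1 = ngram_precision(hyp, ref, 1)
    p2 = ngram_precision(hyp, ref, 2)
    print(f"{name}: unigram={p1:.2f}, bigram={p2:.2f}")

# Both hypotheses score 0.86 at the unigram level (0.83 vs 0.67 for
# bigrams), yet the second reverses the meaning of the source -- exactly
# the kind of high-risk error that human evaluation is designed to catch.
```

Production metrics such as BLEU, chrF, or COMET are more sophisticated than this sketch, but the underlying point stands: overlap-based scores cannot distinguish a harmless lexical substitution from a critical adequacy error, which is why the paper argues for pairing AEMs with trained human evaluators.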
Index Terms
- Promoting Human-centred Machine Translation Quality Assessment in NLP education