Turning Machine Translation Metrics into Confidence Measures

Navarro Martínez, Ángel; Casacuberta Nolla, Francisco

doi:10.1007/978-3-031-08473-7_44

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13286)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

Interactive Machine Translation (IMT) systems produce translations by combining the knowledge of professional translators with the generation speed of Machine Translation (MT) models. The two interact with the aim of generating error-free translations. The main goal of research in the IMT field is to reduce the effort that professional translators must expend during an IMT session. Techniques for reducing this effort vary widely, from changing the display used to perform the corrections to changing the feedback signal that the user sends to the MT model. This article proposes a method that reduces this effort by applying Confidence Measures (CMs), which assign a score to each translation, so that the user only has to correct the translations that obtain a low score. We have trained four Recurrent Neural Network (RNN) models to approximate the scores of four of the most widely used MT metrics: BLEU, METEOR, chrF, and TER. We have simulated user interaction with an Interactive-Predictive Neural Machine Translation (IPNMT) system to study the effort reduction that can be obtained while still getting high-quality translations from the system. We have tested different threshold values for deciding that a translation has a low score, which yields a transition from a conventional IPNMT system, where the user has to correct all the translations, to an unsupervised MT system. The results show that this method obtains very good translations (70 BLEU points) while reducing human effort by 60%.
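
The threshold mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example scores, and the convention that a higher estimated score means a better translation are all assumptions made for illustration (for an error-rate metric such as TER the comparison would presumably be reversed).

```python
from typing import List, Tuple


def select_for_supervision(
    scores: List[float], threshold: float
) -> Tuple[List[int], List[int]]:
    """Split hypothesis indices into (to_correct, accepted).

    Assumption: each score is an estimated quality value where higher
    is better, so a hypothesis scoring below the threshold is sent to
    the user and the rest are accepted without supervision.
    """
    to_correct = [i for i, s in enumerate(scores) if s < threshold]
    accepted = [i for i, s in enumerate(scores) if s >= threshold]
    return to_correct, accepted


def supervised_fraction(scores: List[float], threshold: float) -> float:
    """Fraction of hypotheses the user would be asked to correct."""
    if not scores:
        return 0.0
    to_correct, _ = select_for_supervision(scores, threshold)
    return len(to_correct) / len(scores)


# Hypothetical estimated scores in [0, 1]; these are not data from the paper.
estimated = [0.82, 0.35, 0.61, 0.90, 0.12]

# Sweeping the threshold moves between the two extremes the abstract
# mentions: nothing supervised (threshold at the minimum) and
# everything supervised (threshold above the maximum).
for threshold in (0.0, 0.5, 1.01):
    print(threshold, supervised_fraction(estimated, threshold))
```

With these made-up scores, a threshold of 0.0 sends nothing to the user, which corresponds to the unsupervised MT extreme, while a threshold above the highest score sends everything, which corresponds to the conventional IPNMT extreme.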

Acknowledgements

This work received funds from the Comunitat Valenciana under project EU-FEDER (IDIFEDER/2018/025), the Generalitat Valenciana under project ALMAMATER (PrometeoII/2014/030), the Ministerio de Ciencia e Investigación/Agencia Estatal de Investigación (10.13039/501100011033) and “FEDER Una manera de hacer Europa” under project MIRANDA-DocTIUM (RTI2018-095645-B-C22), and the Universitat Politècnica de València under the program PAID-01-21.

Author information

Authors and Affiliations

Research Center of Pattern Recognition and Human Language Technology, Universitat Politècnica de València, 46022, Valencia, Spain
Ángel Navarro Martínez & Francisco Casacuberta Nolla

Authors

Ángel Navarro Martínez
Francisco Casacuberta Nolla

Corresponding author

Correspondence to Ángel Navarro Martínez.

Editor information

Editors and Affiliations

Universitat Politècnica de València, Valencia, Spain
Paolo Rosso
University of Turin, Torino, Italy
Valerio Basile
Universidad Nacional de Educación a Distancia, Madrid, Spain
Raquel Martínez
Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Derby, Derby, UK
Farid Meziane

About this paper

Cite this paper

Navarro Martínez, Á., Casacuberta Nolla, F. (2022). Turning Machine Translation Metrics into Confidence Measures. In: Rosso, P., Basile, V., Martínez, R., Métais, E., Meziane, F. (eds) Natural Language Processing and Information Systems. NLDB 2022. Lecture Notes in Computer Science, vol 13286. Springer, Cham. https://doi.org/10.1007/978-3-031-08473-7_44

DOI: https://doi.org/10.1007/978-3-031-08473-7_44
Published: 13 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08472-0
Online ISBN: 978-3-031-08473-7
eBook Packages: Computer Science, Computer Science (R0)
