Abstract
We propose a novel metric for machine translation evaluation based on neural networks. In the training phase, we maximize the distance between the similarity scores of high- and low-quality hypotheses. The trained neural network is then used to evaluate new hypotheses in the testing phase. The proposed metric can efficiently incorporate lexical and syntactic metrics as features in the network, and is thus able to capture different levels of linguistic information. Experiments on WMT-14 show that state-of-the-art performance is achieved in two out of five language pairs at the system level and in one at the segment level, with comparable results in the remaining language pairs.
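The training objective sketched above, widening the gap between the similarity scores assigned to high- and low-quality hypotheses, resembles a pairwise max-margin (hinge) ranking loss. The following is a minimal illustrative sketch, not the paper's implementation: the cosine scorer, the function names, and the margin value are all assumptions, and the toy vectors stand in for learned sentence representations.

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine similarity between two sentence vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pairwise_margin_loss(ref, good_hyp, bad_hyp, margin=1.0):
    """Hinge loss that pushes sim(ref, good_hyp) above sim(ref, bad_hyp)
    by at least `margin`; the loss is zero once the gap is wide enough."""
    s_good = cosine_similarity(ref, good_hyp)
    s_bad = cosine_similarity(ref, bad_hyp)
    return max(0.0, margin - (s_good - s_bad))

# Toy vectors standing in for learned sentence representations.
ref = np.array([1.0, 0.0, 1.0])
good = np.array([0.9, 0.1, 1.1])   # close to the reference
bad = np.array([-1.0, 1.0, 0.0])   # far from the reference

print(pairwise_margin_loss(ref, good, bad))  # zero: the gap already exceeds the margin
```

Minimizing such a loss over many (reference, better hypothesis, worse hypothesis) triples drives the network to score better hypotheses as more similar to the reference, which is the intuition behind maximizing the similarity distance.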
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 61379086, the European Union Horizon 2020 research and innovation programme under grant agreement 645452 (QT21), and the ADAPT Centre for Digital Content Technology (www.adaptcentre.ie) at Dublin City University, funded under the SFI Research Centres Programme (Grant 13/RC/2106) and co-funded under the European Regional Development Fund.
© 2016 Springer International Publishing AG
Cite this paper
Ma, Q. et al. (2016). MaxSD: A Neural Machine Translation Evaluation Metric Optimized by Maximizing Similarity Distance. In: Lin, C.Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL/NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_13
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4