Abstract
This paper presents AL-TERp (Arabic Language Translation Edit Rate - Plus), an extended version of the machine translation evaluation metric TER-Plus that supports Arabic. The metric takes into account valuable linguistic features of Arabic, such as synonyms and stems, and correlates well with human judgments. The development of such a tool will greatly benefit the building of machine translation systems from other languages into Arabic, whose quality remains below expectations, specifically for evaluation and optimization tasks.
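The core idea of a TER-style metric with stem and synonym awareness can be sketched as a word-level edit distance in which substitutions between stem-matched or synonymous words cost less than full substitutions. The sketch below is only illustrative, not the actual AL-TERp implementation: the `stem` function, the synonym table, and the cost values are placeholder assumptions (in practice these would come from Arabic resources such as a light stemmer and Arabic WordNet), and real TER/TER-Plus also models block shifts, which are omitted here.

```python
def edit_rate(hyp, ref, synonyms=None, stem=lambda w: w,
              stem_cost=0.2, syn_cost=0.2):
    """TER-style score: total edit cost divided by reference length.

    hyp, ref  -- tokenized hypothesis and reference (lists of words)
    synonyms  -- set of frozensets, each grouping mutually synonymous words
    stem      -- stemming function (identity by default; a stand-in here)
    """
    synonyms = synonyms or set()
    n, m = len(hyp), len(ref)
    # dp[i][j] = minimal cost to transform hyp[:i] into ref[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = float(i)
    for j in range(1, m + 1):
        dp[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            h, r = hyp[i - 1], ref[j - 1]
            if h == r:
                sub = 0.0                      # exact match: free
            elif stem(h) == stem(r):
                sub = stem_cost                # stem match: cheap edit
            elif any(h in s and r in s for s in synonyms):
                sub = syn_cost                 # synonym match: cheap edit
            else:
                sub = 1.0                      # full substitution
            dp[i][j] = min(dp[i - 1][j] + 1.0,      # deletion
                           dp[i][j - 1] + 1.0,      # insertion
                           dp[i - 1][j - 1] + sub)  # (partial) substitution
    return dp[n][m] / m
```

With a synonym pair registered, a lexically different but synonymous word is penalized only lightly, which is the behavior that lets such a metric track human adequacy judgments more closely than surface-only edit rates.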
Acknowledgments
We would like to thank Houda Bouamor for her gracious help in providing the dataset used in this work and for her comments and recommendations.
© 2017 Springer International Publishing AG
El Marouani, M., Boudaa, T., Enneya, N. (2017). AL-TERp: Extended Metric for Machine Translation Evaluation of Arabic. In: Frasincar, F., Ittoo, A., Nguyen, L., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2017. Lecture Notes in Computer Science(), vol 10260. Springer, Cham. https://doi.org/10.1007/978-3-319-59569-6_17
Print ISBN: 978-3-319-59568-9
Online ISBN: 978-3-319-59569-6