Abstract
Lexicalized reordering models are adopted in state-of-the-art phrase-based machine translation systems to improve the word order of translation output. The most widely used model, MSD (Monotone, Swap, Discontinuous), is designed generically and has been used for every language pair without customization. However, in translation between Chinese and English, reordering distances tend to be long because of the syntactic differences between the two languages, in which case the MSD model is likely to deliver inappropriate results.
Based on an intensive investigation of a large English-Chinese bilingual corpus, we redesign the orientation set of the reordering model and propose a new lexicalized reordering model, MLR (Monotone, LeftDiscontinuous, RightDiscontinuous), which is tailored for Chinese-to-English (C2E) and English-to-Chinese (E2C) MT. MLR handles long-distance word reordering well. Its superiority is verified in our empirical studies, and it has already been applied in the Youdao online translation system (http://fanyi.youdao.com).
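To make the difference between the two orientation sets concrete, the following is a minimal sketch rather than the authors' implementation. The MSD function follows the usual phrase-based definition, where the orientation of the current phrase is judged against the source span of the previously translated phrase. The MLR function is an assumption inferred only from the orientation names in the abstract: Swap is taken to be folded into LeftDiscontinuous, and any forward jump is taken to be RightDiscontinuous. The names `Span`, `msd_orientation`, and `mlr_orientation` are hypothetical.

```python
from typing import Tuple

# Inclusive (start, end) indices of a phrase on the source side.
# Spans of different phrases are assumed not to overlap.
Span = Tuple[int, int]


def msd_orientation(prev: Span, cur: Span) -> str:
    """Standard MSD orientation of `cur` relative to the previous phrase."""
    if cur[0] == prev[1] + 1:
        return "M"  # monotone: cur starts right after prev ends
    if cur[1] == prev[0] - 1:
        return "S"  # swap: cur ends right before prev starts
    return "D"      # discontinuous: any other jump, in either direction


def mlr_orientation(prev: Span, cur: Span) -> str:
    """ASSUMED reading of MLR; the abstract does not define the orientations."""
    if cur[0] == prev[1] + 1:
        return "M"  # monotone, same condition as in MSD
    if cur[0] > prev[1]:
        return "R"  # assumed RightDiscontinuous: forward jump over a gap
    return "L"      # assumed LeftDiscontinuous: any jump back to the left


if __name__ == "__main__":
    prev = (5, 6)
    for cur in [(7, 8), (3, 4), (0, 1), (10, 12)]:
        print(prev, cur, msd_orientation(prev, cur), mlr_orientation(prev, cur))
```

Under this reading, MSD gives both long jumps the same label `D`, while the MLR sketch separates them by direction, which is the kind of distinction that matters when reordering distances are long.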
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Su, F., Huang, J., Su, K. (2013). A Customized Lexicalized Reordering Model for Machine Translation between Chinese and English. In: Liu, P., Su, Q. (eds) Chinese Lexical Semantics. CLSW 2013. Lecture Notes in Computer Science, vol 8229. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45185-0_38
DOI: https://doi.org/10.1007/978-3-642-45185-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45184-3
Online ISBN: 978-3-642-45185-0
eBook Packages: Computer Science (R0)