Abstract
This paper presents a new reordering method for phrase-based statistical machine translation (PBSMT). The method builds on previous chunk-level reordering methods for PBSMT and performs global reordering. First, we parse the source-language sentence into a chunk tree using the method of [1]. Second, we apply to the chunk tree, at the chunk level, a series of transformation rules learned automatically from the parallel corpus. Finally, we resolve overlaps between phrases and chunks and integrate the global reordering model directly into the decoder as a graph of phrases. Experimental results on the English-Vietnamese and English-French pairs show that our method outperforms the baseline PBSMT in both accuracy and speed.
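The second step, applying learned transformation rules at the chunk level, can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table, the chunk labels, the greedy longest-match strategy, and the names `RULES` and `reorder_chunks` are all assumptions made for illustration, and a real system would learn its rules from word-aligned parallel data.

```python
from typing import Dict, List, Tuple

Chunk = Tuple[str, List[str]]   # (chunk label, tokens in the chunk)
Perm = Tuple[int, ...]          # new order of the chunks in a matched window

# Hypothetical rule table: a sequence of chunk labels maps to a permutation.
# In the paper such rules are learned from the parallel corpus; this one is invented.
RULES: Dict[Tuple[str, ...], Perm] = {
    ("ADVP", "VP"): (1, 0),     # move the verb chunk in front of the adverb chunk
}


def reorder_chunks(chunks: List[Chunk], rules: Dict[Tuple[str, ...], Perm]) -> List[Chunk]:
    """Scan left to right and apply the longest matching rule at each position."""
    max_len = max((len(key) for key in rules), default=0)
    out: List[Chunk] = []
    i = 0
    while i < len(chunks):
        # Try the longest window first; rules cover at least two chunks.
        for n in range(min(max_len, len(chunks) - i), 1, -1):
            window = chunks[i:i + n]
            perm = rules.get(tuple(label for label, _ in window))
            if perm is not None:
                out.extend(window[j] for j in perm)
                i += n
                break
        else:
            # No rule matched here: keep the chunk in place.
            out.append(chunks[i])
            i += 1
    return out


sentence = [("NP", ["I"]), ("ADVP", ["quickly"]), ("VP", ["ran"])]
reordered = reorder_chunks(sentence, RULES)
tokens = [tok for _, toks in reordered for tok in toks]
print(tokens)
```

Tokens inside a chunk keep their original order; only whole chunks move, which is what distinguishes chunk-level reordering from word-level reordering.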
References
Tsuruoka, Y., Tsujii, J.: Chunk parsing revisited. In: Proceedings of the 9th International Workshop on Parsing Technologies (IWPT 2005) (2005)
Koehn, P., Och, F.J., Marcu, D.: Statistical phrase-based translation. In: Proceedings of HLT-NAACL 2003, Edmonton, Canada, pp. 127–133 (2003)
Och, F.J., Ney, H.: The alignment template approach to statistical machine translation. Computational Linguistics 30(4), 417–449 (2004)
Zens, R., Ney, H., Watanabe, T., Sumita, E.: Reordering constraints for phrase-based statistical machine translation. In: Proceedings of the 20th International Conference on Computational Linguistics (CoLing), Geneva, Switzerland, pp. 205–211 (2004)
Wu, D.: A polynomial-time algorithm for statistical machine translation. In: Proceedings of ACL 1996, Santa Cruz, CA, pp. 152–158 (1996)
Chiang, D.: A hierarchical phrase-based model for statistical machine translation. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), Ann Arbor, Michigan, pp. 263–270. Association for Computational Linguistics (June 2005)
Collins, M., Koehn, P., Kucerová, I.: Clause restructuring for statistical machine translation. In: Proc. ACL 2005, Ann Arbor, USA, pp. 531–540 (2005)
Quirk, C., Menezes, A., Cherry, C.: Dependency treelet translation: Syntactically informed phrasal SMT. In: Proceedings of ACL 2005, Ann Arbor, Michigan, USA, pp. 271–279 (2005)
Galley, M., Graehl, J., Knight, K., Marcu, D., DeNeefe, S., Wang, W., Thayer, I.: Scalable inference and training of context-rich syntactic translation models. In: Proceedings of COLING/ACL 2006, Sydney, Australia, pp. 961–968 (2006)
Koehn, P., Axelrod, A., Mayne, A.B., Callison-Burch, C., Osborne, M., Talbot, D., White, M.: Edinburgh system description for the 2005 NIST MT evaluation. In: Proceedings of Machine Translation Evaluation Workshop 2005 (2005)
Xiong, D., Liu, Q., Lin, S.: Maximum entropy based phrase reordering model for statistical machine translation. In: Proceedings of ACL 2006, pp. 521–528 (2006)
Zens, R., Ney, H.: Discriminative reordering models for statistical machine translation. In: Proceedings of the Workshop on Statistical Machine Translation, pp. 55–63 (2006)
Nguyen, P.T., Shimazu, A., Nguyen, L.M., Nguyen, V.V.: A syntactic transformation model for statistical machine translation. International Journal of Computer Processing of Oriental Languages (IJCPOL) 20(2), 1–20 (2007)
Zhang, Y., Zens, R., Ney, H.: Chunk-level reordering of source language sentences with automatically learned rules for statistical machine translation. In: Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation, pp. 1–8 (2007)
Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: Proc. 18th International Conference on Machine Learning, pp. 282–289. Morgan Kaufmann, San Francisco (2001)
Nguyen, T.P., Shimazu, A.: Improving phrase-based SMT with morpho-syntactic analysis and transformation. In: Proceedings of AMTA 2006 (2006)
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: Open source toolkit for statistical machine translation. In: Proceedings of ACL, Demonstration Session (2007)
Koehn, P.: Europarl: A parallel corpus for statistical machine translation. In: Proceedings of MT Summit 2005 (2005)
Och, F.J., Ney, H.: A systematic comparison of various statistical alignment models. Computational Linguistics 29(1), 19–51 (2003)
Stolcke, A.: SRILM - an extensible language modeling toolkit. In: Proceedings of International Conference on Spoken Language Processing, vol. 29, pp. 901–904 (2002)
Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: Proc. of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, PA, pp. 311–318 (July 2002)
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Van Nguyen, V., Phuong Nguyen, T., Shimazu, A., Le Nguyen, M. (2008). A Reordering Model for Phrase-Based Machine Translation. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_45
DOI: https://doi.org/10.1007/978-3-540-85287-2_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85286-5
Online ISBN: 978-3-540-85287-2
eBook Packages: Computer Science (R0)