A Syntactic-based Word Re-ordering for English-Vietnamese Statistical Machine Translation System

  • Conference paper
PRICAI 2008: Trends in Artificial Intelligence (PRICAI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5351)

Abstract

In machine translation, re-ordering words from the source language to the target language is one of the major steps affecting system performance. Among the many approaches to this problem, syntax-based re-ordering is an effective method for handling word order in a statistical machine translation (SMT) system. In this paper, we introduce a word re-ordering approach for an English-Vietnamese SMT system that makes use of syntactic rules extracted from parse trees. Our rule set includes re-ordering rules for noun phrases, verb phrases and adjective phrases. According to the experimental results, the noun phrase rules are the most significant of all. Compared with the MOSES phrase-based SMT system [1], these rules can improve the BLEU score by 3.24 on our testing corpus. We also conduct further experiments with different combinations of rules to study their effectiveness, and find that translation performance on each corpus can be tuned by combining the rules in different ways.
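As a rough illustration of what a syntactic re-ordering rule can look like, the sketch below applies one hypothetical noun-phrase rule to an English parse tree before translation: adjectives that precede the head noun are moved to just after it, reflecting the fact that Vietnamese attributive adjectives normally follow the noun. The tree encoding, the function names (`reorder_np`, `reorder`, `leaves`) and the rule itself are assumptions made for this example; they are not the rule set extracted in the paper.

```python
# A parse tree node is either a leaf (pos_tag, word) or a phrase (label, [children]).

def is_leaf(node):
    """A leaf stores the word as a string in its second slot."""
    return isinstance(node[1], str)

def reorder_np(children):
    """Hypothetical NP rule: move pre-nominal adjectives (JJ*) after the head noun.

    The head is taken to be the last noun (NN*) among the direct children,
    which is a simplifying assumption for this sketch.
    """
    noun_positions = [i for i, c in enumerate(children)
                      if is_leaf(c) and c[0].startswith("NN")]
    if not noun_positions:
        return children  # e.g. a pronoun NP: nothing to re-order
    head = noun_positions[-1]
    before, after = children[:head], children[head + 1:]
    adjectives = [c for c in before if is_leaf(c) and c[0].startswith("JJ")]
    others = [c for c in before if not (is_leaf(c) and c[0].startswith("JJ"))]
    return others + [children[head]] + adjectives + after

def reorder(node):
    """Apply the NP rule bottom-up to every noun phrase in the tree."""
    if is_leaf(node):
        return node
    label, children = node
    children = [reorder(c) for c in children]
    if label == "NP":
        children = reorder_np(children)
    return (label, children)

def leaves(node):
    """Read the words of the tree from left to right."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[1] for w in leaves(child)]

# Usage: "I bought a red book" is rewritten into Vietnamese-like order.
tree = ("S", [
    ("NP", [("PRP", "I")]),
    ("VP", [("VBD", "bought"),
            ("NP", [("DT", "a"), ("JJ", "red"), ("NN", "book")])]),
])
reordered_sentence = " ".join(leaves(reorder(tree)))
print(reordered_sentence)  # I bought a book red
```

In a pre-ordering setup of this kind, the re-ordered English side would replace the original source text for both training and decoding, so that the phrase-based system sees source sentences whose word order is closer to the target.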

References

  1. Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: Open source toolkit for statistical machine translation. In: Proceedings of ACL, Demonstration Session (2007)

  2. Xia, F., McCord, M.: Improving a statistical MT system with automatically learned rewrite patterns. In: Proceedings of COLING (2004)

  3. Chang, P.-C., Toutanova, K.: A discriminative syntactic word order model for machine translation. In: Proceedings of the 45th ACL, pp. 9–16 (2007)

  4. Wang, C., Collins, M., Koehn, P.: Chinese syntactic re-ordering for statistical machine translation. In: Proceedings of 2007 Joint Conference on Empirical Methods in NLP and CL NLP, pp. 737–745 (2007)

  5. Collins, M., Koehn, P., Kucerova, I.: Clause restructuring for statistical machine translation. In: Proceedings of the 43rd Annual Meeting of the Assoc. for Computational Linguistics (ACL), Ann Arbor, Michigan, pp. 531–540 (2005)

  6. Nguyen, T.P., Shimazu, A.: A syntactic transformation model for statistical machine translation. In: Matsumoto, Y., Sproat, R.W., Wong, K.-F., Zhang, M. (eds.) ICCPOL 2006. LNCS (LNAI) vol. 4285, pp. 63–74. Springer, Heidelberg (2006)

  7. Koehn, P., Och, F.J., Marcu, D.: Statistical phrase-based translation. In: Proc. of the HLT-NAACL 2003 conference, Edmonton, Alberta, Canada, pp. 127–133 (2003)

  8. Kumar, S., Byrne, W.: Local phrase re-ordering models for statistical machine translation. In: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in NLP, pp. 161–168 (2007)

  9. Sanchis, G., Casacuberta, F.: N-best re-ordering in statistical machine translation. In: Jornadas en Tecnología del Habla, pp. 99–104 (2006)

  10. Zhang, Y., Zens, R., Ney, H.: Chunk-level re-ordering of source language with automatically learned rules for statistical machine translation. In: Proceedings of SSST, NAACL-HLT, pp. 1–8 (2007)

  11. Dien, D.: Comparison word order of attributions in English and Vietnamese. In: Journal of Social Sciences and Humanities. University of Social Sciences and Humanities, HCM City (2001)

  12. Dinh, D.: Building an annotated English-Vietnamese parallel corpus. In: MKS: A Journal of Southeast Asian Linguistics and Languages 35, 21–36 (2005)

  13. Dien, D., Thuy, V.: A maximum entropy approach for Vietnamese word segmentation. In: Proceedings of 4th IEEE International Conference RIVF 2006, Ho Chi Minh City, Vietnam, February 12-16, 2006, pp. 247–252 (2006)

  14. Klein, D., Manning, C.D.: Accurate unlexicalized parsing. In: Proceedings of ACL 2003 (2003)

  15. Li, C.-H., Zhang, D., Li, M., Zhou, M., Li, M., Guan, Y.: A probabilistic approach to syntax-based re-ordering for statistical machine translation. In: Proceedings of 45th ACL, pp. 720–727 (2007)

  16. Papineni, K.A., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: The Proc. of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen Thi, HN., Dinh, D. (2008). A Syntactic-based Word Re-ordering for English-Vietnamese Statistical Machine Translation System. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_75

  • DOI: https://doi.org/10.1007/978-3-540-89197-0_75

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89196-3

  • Online ISBN: 978-3-540-89197-0

  • eBook Packages: Computer Science, Computer Science (R0)
