Exploiting syntactic relationships in a phrase-based decoder: an exploration

Machine Translation

Abstract

Phrase-based decoding is conceptually simple and straightforward to implement, at the cost of drastically oversimplified reordering models. Syntactically aware models make it possible to capture linguistically relevant relationships in order to improve word order, but they can be more complex to implement and optimise. In this paper, we explore a new middle ground between phrase-based and syntactically informed statistical MT, in the form of a model that supplements conventional, non-hierarchical phrase-based techniques with linguistically informed reordering based on syntactic dependency trees. The key idea is to exploit linguistically informed hierarchical structures only for those dependencies that cannot be captured within a single flat phrase. For very local dependencies we leverage the success of conventional phrase-based approaches, which provide a sequence of target-language words appropriately ordered and ready-made with any agreement morphology. Working with dependency trees rather than constituency trees allows us to take advantage of the flexibility of phrase-based systems to treat non-constituent fragments as phrases. We do impose a requirement on what can be translated as a phrase, namely that the fragment be a novel sort of “dependency constituent”, but this is much weaker than the requirement that phrases be traditional linguistic constituents, which has often proven too restrictive in MT systems.

Author information

Corresponding author

Correspondence to Tim Hunter.

About this article

Cite this article

Hunter, T., Resnik, P. Exploiting syntactic relationships in a phrase-based decoder: an exploration. Machine Translation 24, 123–140 (2010). https://doi.org/10.1007/s10590-010-9074-5

  • DOI: https://doi.org/10.1007/s10590-010-9074-5
