Phrasal Syntactic Category Sequence Model for Phrase-Based MT

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7182)

Abstract

Incorporating target-side syntax into phrase-based machine translation (PBMT) can yield more syntactically well-formed translations. We propose a novel phrasal syntactic category sequence (PSCS) model that allows a PBMT decoder to prefer more grammatical translations. We first parse every sentence on the target side of the bilingual training corpus. During the standard phrase-pair extraction procedure, we assign a syntactic category to each phrase pair and build a PSCS model from the parallel training data. We then incorporate the PSCS model log-linearly into a standard PBMT system. The method is simple and yields a 0.7 BLEU point improvement over the baseline PBMT system.
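The abstract does not say how the PSCS model is parameterized, so the following is only a minimal sketch of the general idea under stated assumptions: the category sequence of a hypothesis is scored with an add-one-smoothed bigram model, and that score enters the decoder as one more weighted feature. The class `CategorySequenceModel`, the function `loglinear_score`, the feature names, the weights, and the toy category sequences are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


class CategorySequenceModel:
    """Assumed form: bigram model over phrase-level syntactic categories,
    with add-one smoothing. The paper's actual model may differ."""

    def __init__(self, sequences):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.vocab = {"</s>"}
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            self.vocab.update(seq)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, seq):
        """Log-probability of one category sequence, e.g. ["NP", "VP"]."""
        padded = ["<s>"] + list(seq) + ["</s>"]
        v = len(self.vocab)
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            count = self.bigrams[(prev, cur)]
            total += math.log((count + 1) / (self.contexts[prev] + v))
        return total


def loglinear_score(features, weights):
    """Weighted sum of feature values, as in a log-linear PBMT model."""
    return sum(weights[name] * value for name, value in features.items())


# Toy training data: one category sequence per target sentence (hypothetical).
training = [["NP", "VP"], ["NP", "VP", "PP"], ["NP", "VP"]]
model = CategorySequenceModel(training)

# Score one hypothesis; the other feature values and all weights are made up.
hypothesis_categories = ["NP", "VP"]
features = {
    "translation_model": -4.2,
    "language_model": -7.5,
    "pscs": model.log_prob(hypothesis_categories),
}
weights = {"translation_model": 1.0, "language_model": 0.5, "pscs": 0.3}
score = loglinear_score(features, weights)
```

In a setup like this, a hypothesis whose phrases follow a category order seen often in training (here `NP` then `VP`) receives a higher PSCS feature value than one with an unseen order, which is how such a feature could steer the decoder toward more grammatical output. In practice the feature weights would be tuned rather than set by hand.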

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cao, H., Sumita, E., Zhao, T., Li, S. (2012). Phrasal Syntactic Category Sequence Model for Phrase-Based MT. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_5

  • DOI: https://doi.org/10.1007/978-3-642-28601-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28600-1

  • Online ISBN: 978-3-642-28601-8

  • eBook Packages: Computer Science, Computer Science (R0)
