Cardinality pruning and language model heuristics for hierarchical phrase-based translation

Machine Translation

Abstract

In this article we present two novel enhancements for cube pruning and cube growing, two of the most widely used search algorithms in the hierarchical approach to statistical machine translation. Cube pruning is the de facto standard search algorithm for the hierarchical model. We adapt the source cardinality synchronous search organization used in standard phrase-based translation to the characteristics of cube pruning. This improves the performance of the generation process and reduces the average translation time per sentence to approximately one quarter. We also investigate the cube growing algorithm, a reformulation of cube pruning with on-demand computation. This algorithm depends on a heuristic for the language model, an issue that is barely discussed in the original work. We analyze the behaviour of this heuristic and propose a new one that greatly reduces memory consumption at no cost in runtime or translation performance. Results are reported on the German–English Europarl corpus.
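For readers unfamiliar with cube pruning, the following is a minimal sketch of its core idea: a best-first exploration of the grid formed by two sorted candidate lists, keeping only the first k combinations popped from a priority queue. It is a generic illustration, not the implementation described in the article. The function name `cube_prune`, the `combine` callback, and the convention that lower costs are better are all assumptions made for this example.

```python
import heapq


def cube_prune(left, right, combine, k):
    """Return up to k (cost, i, j) candidates from the grid left x right.

    `left` and `right` are lists of costs sorted in ascending order
    (lower is better, an assumption of this sketch). `combine` scores a
    pair of entries; in a real decoder it would include the language
    model cost, which makes the grid only approximately monotonic.
    """
    if not left or not right or k <= 0:
        return []
    # Start from the top-left corner, the best pair if costs were additive.
    frontier = [(combine(left[0], right[0]), 0, 0)]
    seen = {(0, 0)}
    popped = []
    while frontier and len(popped) < k:
        cost, i, j = heapq.heappop(frontier)
        popped.append((cost, i, j))
        # Push the two grid neighbours of the candidate just popped.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(frontier, (combine(left[ni], right[nj]), ni, nj))
    # With a non-monotonic `combine`, pop order need not be sorted.
    return sorted(popped)
```

When `combine` is a plain sum the result is the exact k-best list. Once a language-model term is added, the grid is no longer monotonic, so the output is an approximation, which is why the search counts as pruning.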

Author information

Corresponding author

Correspondence to David Vilar.

About this article

Cite this article

Vilar, D., Ney, H. Cardinality pruning and language model heuristics for hierarchical phrase-based translation. Machine Translation 26, 217–254 (2012). https://doi.org/10.1007/s10590-011-9119-4

  • DOI: https://doi.org/10.1007/s10590-011-9119-4
