What types of word alignment improve statistical machine translation?

Machine Translation

Abstract

In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. However, a systematic study is still needed of which alignment characteristics benefit MT under specific experimental settings, such as the type of MT system, the language pair, or the type or size of the corpus. In this paper we perform, in each of these experimental settings, a statistical analysis of the data and study the sample correlation coefficients between a number of alignment or phrase-table characteristics and variables such as the phrase-table size, the number of untranslated words or the BLEU score. We report results for two different SMT systems (a phrase-based and an n-gram-based system) on Chinese-to-English FBIS and BTEC data, and Spanish-to-English European Parliament data. We find that the alignment characteristics that help translation depend greatly on the MT system and on the corpus size. We give alignment hints for improving the BLEU score, depending on the SMT system used and the type of corpus. For example, for phrase-based SMT, dense alignments are required with larger corpora, especially on the target side, while with smaller corpora, more precise, sparser alignments are better, especially on the source side. Avoiding some long-distance crossing links may also improve the BLEU score with small corpora. We take these conclusions into account to modify two types of alignment systems, and obtain 1 to 1.6% relative improvements in BLEU score on two held-out corpora, although the system that improves is different for each corpus.
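As a rough illustration of the kind of statistic the abstract refers to, the following minimal sketch computes a Pearson sample correlation coefficient between one alignment characteristic and BLEU. The function name `sample_correlation`, the variable names, and all numeric values are hypothetical and chosen only for illustration; they are not the paper's data, code, or results, and the abstract does not say which implementation the authors used.

```python
import math


def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centred products; the 1/(n-1) factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (NOT from the paper): one entry per alignment run.
links_per_word = [0.80, 0.90, 1.00, 1.10, 1.20]  # assumed density measure
bleu_scores = [20.1, 20.4, 20.3, 20.9, 21.0]     # assumed BLEU per run

r = sample_correlation(links_per_word, bleu_scores)
print(f"sample correlation between density and BLEU: {r:.3f}")
```

A coefficient near +1 or -1 would suggest a strong linear association between the characteristic and BLEU in that setting, while a value near 0 would suggest none; it says nothing about causation on its own.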

Author information

Corresponding author

Correspondence to Patrik Lambert.

Additional information

P. Lambert, Y. Ma and A. Way: work partially done while at CNGL, Dublin City University, Ireland.

About this article

Cite this article

Lambert, P., Petitrenaud, S., Ma, Y. et al. What types of word alignment improve statistical machine translation? Machine Translation 26, 289–323 (2012). https://doi.org/10.1007/s10590-012-9123-3

  • DOI: https://doi.org/10.1007/s10590-012-9123-3
