Statistical machine translation

Author:
Adam Lopez

University of Edinburgh, Edinburgh, United Kingdom

ACM Computing Surveys, Volume 40, Issue 3, Article No. 8, pp. 1–49. https://doi.org/10.1145/1380584.1380586

Published: 13 August 2008

Abstract

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and new ideas are constantly introduced. This survey presents a tutorial overview of the state of the art. We describe the context of the current research and then move to a formal problem description and an overview of the main subproblems: translation modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and a discussion of future directions.

References

Aho, A. V. and Ullman, J. D. 1969. Syntax directed translations and the pushdown assembler. J. Comput. Syst. Sci. 3, 37--57.Google ScholarDigital Library
Ahrenberg, L., Merkel, M., Hein, A. S., and Tiedmann, J. 2000. Evaluation of word alignment systems. In Proceedings of the International Conference on Language Resources and Evaluation (LREC). Vol. 3. 1255--1261.Google Scholar
Al-Onaizan, Y., Curin, J., Jahr, M., Knight, K., Lafferty, J., Melamed, D., Och, F. J., Purdy, D., Smith, N. A., and Yarowsky, D. 1999. Statistical machine translation: Tech. rep., Center for Speech and Language Processing, Johns Hopkins University.Google Scholar
Al-Onaizan, Y. and Papineni, K. 2006. Distortion models for statistical machine translation. In Proceedings of ACL-COLING. 529--536. Google ScholarDigital Library
Albrecht, J. and Hwa, R. 2007. Regression for sentence-level MT evaluation with pseudo references. In Proceedings of the Association for Computational Linguistics (ACL). 296--303.Google Scholar
Alshawi, H., Bangalore, S., and Douglas, S. 2000. Learning dependency translation models as collections of finite state head transducers. Computat. Linguist. 26, 1, 45--60. Google ScholarDigital Library
Ayan, N. F. and Dorr, B. 2006a. Going beyond AER: An extensive analysis of word alignments and their impact on MT. In Proceedings of ACL-COLING. 9--16. Google ScholarDigital Library
Ayan, N. F., Dorr, B., and Monz, C. 2005a. Alignment link projection using transformation-based learning. In Proceedings of HLT-EMNLP. 185--192. Google ScholarDigital Library
Ayan, N. F., Dorr, B., and Monz, C. 2005b. Neuralign: Combining word alignments using neural networks. In Proceedings of HLT-EMNLP. 65--72. Google ScholarDigital Library
Ayan, N. F. and Dorr, B. J. 2006b. A maximum entropy approach to combining word alignments. In Proceedings of HLT-NAACL. 96--103. Google ScholarDigital Library
Banchs, R. E., Crego, J. M., de Gispert, A., Lambert, P., and Mariño, J. B. 2005. Statistical machine translation of euparl data by using bilingual n-grams. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 133--136. Google ScholarDigital Library
Bannard, C. and Callison-Burch, C. 2005. Paraphrasing with bilingual parallel corpora. In Proceedings of the Association for Computational Linguistics (ACL). 597--604. Google ScholarDigital Library
Baum, L. E. 1972. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process. In Proceedings of the 3rd Symposium on Inequalities. Inequalities, vol. 3. Academic Press, 1--8.Google Scholar
Berger, A. L., Brown, P. F., Pietra, S. A. D., Pietra, V. J. D., Gillett, J. R., Lafferty, J. D., Mercer, R. L., Printz, H., and Ures, L. 1994. The Candide system for machine translation. In Proceedings of the ARPA Workshop on Human Language Technology. 157--162. Google ScholarDigital Library
Berger, A. L., Brown, P. F., Pietra, S. A. D., Pietra, V. J. D., Kehler, A. S., and Mercer, R. L. 1996. Language translation apparatus and method using context-based translation models. United States Patent 5510981.Google Scholar
Berger, A. L., Pietra, S. A. D., and Pietra, V. J. D. 1996. A maximum entropy approach to natural language processing. Comput. Linguist. 22, 1, 39--71. Google ScholarDigital Library
Birch, A., Callison-Burch, C., Osborne, M., and Koehn, P. 2006. Constraining the phrase-based, joint probability statistical translation model. In Proceedings of HLT-NAACL Workshop on Statistical Machine Translation. 154--157. Google ScholarDigital Library
Blunsom, P. and Cohn, T. 2006. Discriminative word alignment with conditional random fields. In Proceedings of ACL-COLING. 65--72. Google ScholarDigital Library
Brants, T., Popat, A. C., Xu, P., Och, F. J., and Dean, J. 2007. Large language models in machine translation. In Proceedings of EMNLP-CoNLL. 858--867.Google Scholar
Brown, P. F., Cocke, J., Pietra, S. D., Pietra, V. J. D., Jelinek, F., Lafferty, J. D., Mercer, R. L., and Roossin, P. S. 1990. A statistical approach to machine translation. Comput. Linguist. 16, 2, 79--85. Google ScholarDigital Library
Brown, P. F., deSouza, P. V., Mercer, R. L., Pietra, V. J. D., and Lai, J. C. 1992. Class-based n-gram models of natural language. Comput. Linguist. 18, 4, 467--479. Google ScholarDigital Library
Brown, P. F., Pietra, S. A. D., Pietra, V. J. D., and Mercer, R. L. 1993. The mathematics of statistical machine translation: Parameter estimation. Comput. Linguist. 19, 2, 263--311. Google ScholarDigital Library
Burbank, A., Carpuat, M., Clark, S., Dreyer, M., Fox, P., Groves, D., Hall, K., Hearne, M., Melamed, I. D., Shen, Y., Way, A., Wellington, B., and Wu, D. 2005. Final report of the 2005 language engineering workshop on statistical machine translation by parsing. Tech. rep., Johns Hopkins University Center for Speech and Language Processing.Google Scholar
Callison-Burch, C., Bannard, C., and Schroeder, J. 2005. Scaling phrase-based statistical machine translation to larger corpora and longer phrases. In Proceedings of the Association for Computational Linguistics (ACL). 255--262. Google ScholarDigital Library
Callison-Burch, C., Fordyce, C., Koehn, P., Monz, C., and Schroeder, J. 2007. (Meta-) evaluation of machine translation. In Proceedings of the Workshop on Statistical Machine Translation. 136--158. Google ScholarDigital Library
Callison-Burch, C., Koehn, P., and Osborne, M. 2006. Improved statistical machine translation using paraphrases. In Proceedings of HLT-NAACL. Google ScholarDigital Library
Callison-Burch, C., Osborne, M., and Koehn, P. 2006. Re-evaluating the role of BLEU in machine translation research. In Proceedings of European Chapter of the Association for Computational Linguistics (EACL). 249--256.Google Scholar
Callison-Burch, C., Talbot, D., and Osborne, M. 2004. Statistical machine translation with word- and sentence-aligned parallel corpora. In Proceedings of the Association for Computational Linguistics (ACL). 176--183. Google ScholarDigital Library
Carpuat, M. and Wu, D. 2005. Word sense disambiguation vs. statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 387--394. Google ScholarDigital Library
Carpuat, M. and Wu, D. 2007. Improving statistical machine translation using word sense disambiguation. In Proceedings of the Association for Computational Linguistics (ACL). 61--72.Google Scholar
Chan, Y. S., Ng, H. T., and Chiang, D. 2007. Word sense disambiguation improves statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 33--40.Google Scholar
Charniak, E., Knight, K., and Yamada, K. 2003. Syntax-based language models for statistical machine translation. In Proceedings of MT Summit IX.Google Scholar
Chelba, C. and Jelinek, F. 1998. Exploiting syntactic structure for language modeling. In Proceedings of ACL-COLING. 225--231. Google ScholarDigital Library
Chen, S. F. and Goodman, J. 1998. An empirical study of smoothing techniques for language modeling. Tech. rep. TR-10-98, Computer Science Group, Harvard University.Google Scholar
Cherry, C. and Lin, D. 2003. A probability model to improve word alignment. In Proceedings of the Association for Computational Linguistics (ACL). Google ScholarDigital Library
Chiang, D. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 263--270. Google ScholarDigital Library
Chiang, D. 2006. An introduction to synchronous grammars. Part of a ACL Tutorial.Google Scholar
Chiang, D. 2007. Hierarchical phrase-based translation. Comput. Linguist. 33, 2. Google ScholarDigital Library
Chiang, D., Lopez, A., Madnani, N., Monz, C., Resnik, P., and Subotin, M. 2005. The Hiero machine translation system: Extensions, evaluation, and analysis. In Proceedings of HLT-EMNLP. 779--786. Google ScholarDigital Library
Church, K. and Hovy, E. 1993. Good applications for crummy machine translation. Mach. Transl. 8, 239--258.Google ScholarCross Ref
Church, K. and Patil, R. 1982. Coping with syntactic ambiguity or how to put the block in the box on the table. Comput. Linguist. 8, 3--4, 139--149. Google ScholarDigital Library
Collins, M., Hajič, J., Ramshaw, L., and Tillman, C. 1999. A statistical parser for Czech. In Proceedings of the Association for Computational Linguistics (ACL). 505--512. Google ScholarDigital Library
Collins, M., Koehn, P., and Kucerova, I. 2005. Clause restructuring for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 531--540. Google ScholarDigital Library
Darroch, J. N. and Ratcliff, D. 1972. Generalized iterative scaling for log-linear models. Annals Math. Statist. 43, 5, 1470--1480.Google ScholarCross Ref
Dejean, H., Gaussier, E., Goutte, C., and Yamada, K. 2003. Reducing parameter space for word alignment. In Proceedings of HLT-NAACL Workshop on Building and Using Parallel Texts. 23--26. Google ScholarDigital Library
Dempster, A. P., Laird, N. M., and Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statist. Soc. 39, 1, 1--38.Google Scholar
DeNeefe, S., Knight, K., and Chan, H. H. 2005. Interactively exploring a machine translation model. In Proceedings of the Association for Computational Linguistics (ACL) (Companion Vol.). 97--100. Google ScholarDigital Library
DeNero, J., Gillick, D., Zhang, J., and Klein, D. 2006. Why generative phrase models underperform surface heuristics. In Proceedings of HLT-NAACL Workshop on Statistical Machine Translation. 31--38. Google ScholarDigital Library
DeNero, J. and Klein, D. 2007. Tailoring word alignments to syntactic machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 17--24.Google Scholar
Doddington, G. 2002. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Human Language Technology Conference (HLT). Google ScholarDigital Library
Dorr, B. J., Jordan, P. W., and Benoit, J. W. 1999. A survey of current paradigms in machine translation. In Advances in Computers, M. Zelkowitz, Ed. Vol. 49. Academic Press, 1--68.Google Scholar
Eck, M., Vogel, S., and Waibel, A. 2004. Language model adaptation for statistical machine translation based on information retrieval. In Proceedings of the International Conference on Language Resources and Evaluation (LREC).Google Scholar
Federico, M. and Bertoldi, N. 2006. How many bits are needed to store probabilities for phrase-based translation&quest; In Proceedings of NAACL Workshop on Statistical Machine Translation. 94--101. Google ScholarDigital Library
Foster, G., Kuhn, R., and Johnson, H. 2006. Phrasetable smoothing for statistical machine translation. In Proceedings of EMNLP. 53--61. Google ScholarDigital Library
Fox, H. J. 2002. Phrasal cohesion and statistical machine translation. In Proceedings of EMNLP. 304--311. Google ScholarDigital Library
Fraser, A. and Marcu, D. 2006. Semi-supervised training for statistical word alignment. In Proceedings of the Association for Computational Linguistics (ACL). 769--776. Google ScholarDigital Library
Fraser, A. and Marcu, D. 2007a. Getting the structure right for word alignment: LEAF. In Proceedings of EMNLP-CoNLL. 51--60.Google Scholar
Fraser, A. and Marcu, D. 2007b. Measuring word alignment quality for statistical machine translation. Comput. Linguist. 33, 3. Google ScholarDigital Library
Gale, W. A. and Church, K. W. 1991. Identifying word correspondences in parallel text. In Proceedings of Darpa Workshop on Speech and Natural Language. 152--157. Google ScholarDigital Library
Gale, W. A. and Church, K. W. 1993. A program for aligning sentences in bilingual corpora. Comput. Linguist. 19, 1, 75--102. Google ScholarDigital Library
Galley, M., Graehl, J., Knight, K., Marcu, D., DeNeefe, S., Wang, W., and Thayer, I. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of the Association for Computational Linguistics (ACL). 961--968. Google ScholarDigital Library
Galley, M., Hopkins, M., Knight, K., and Marcu, D. 2004. What's in a translation rule&quest; In Proceedings of HLT-NAACL. 273--280.Google Scholar
Germann, U. 2003. Greedy decoding for statistical machine translation in almost linear time. In Proceedings of HLT-NAACL. 72--79. Google ScholarDigital Library
Germann, U., Jahr, M., Knight, K., Marcu, D., and Yamada, K. 2001. Fast decoding and optimal decoding for machine translation. In Proceedings of ACL-EACL. Google ScholarDigital Library
Germann, U., Jahr, M., Knight, K., Marcu, D., and Yamada, K. 2004. Fast and optimal decoding for machine translation. Artif. Intellig. 154, 1--2, 127--143. Google ScholarDigital Library
Gildea, D. 2004. Dependencies vs. constituencies for tree-based alignment. In Proceedings of EMNLP. 214--221.Google Scholar
Goldwater, S. and McClosky, D. 2005. Improving statistical MT through morphological analysis. In Proceedings of HLT-EMNLP. 676--683. Google ScholarDigital Library
Graehl, J. and Knight, K. 2004. Training tree transducers. In Proceedings of HLT-NAACL. 105--112.Google Scholar
Hopcroft, J. E. and Ullman, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley. Google ScholarDigital Library
Hovy, E., King, M., and Popescu-Belis, A. 2002. Principles of context-based machine translation evaluation. Mach. Transl. 17, 1, 43--75. Google ScholarDigital Library
Huang, L. and Chiang, D. 2005. Better k-best parsing. In Proceedings of the International Workshop on Pausing Technologies (IWPT). 53--64. Google ScholarDigital Library
Huang, L. and Chiang, D. 2007. Forest rescoring: Faster decoding with integrated language models. In Proceedings of the Association for Computational Linguistics (ACL). 144--151.Google Scholar
Hutchins, J. 2007. Machine translation: A concise history. In Computer Aided Translation: Theory and Practice, C. S. Wai, Ed. Chinese University of Hong Kong.Google Scholar
Hwa, R., Resnik, P., Weinberg, A., Cabezas, C., and Kolak, O. 2005. Bootstrapping parsers via syntactic projection across parallel texts. Natural Lang. Engin. 11, 3, 311--325. Google ScholarDigital Library
Ittycheriah, A. and Roukos, S. 2005. A maximum entropy word aligner for Arabic-English machine translation. In Proceedings of HLT-EMNLP. 89--96. Google ScholarDigital Library
Jelinek, F. 1969. A stack algorithm for faster sequential decoding of transmitted information. Tech. rep. RC2441, IBM Research Center, Yorktown Heights, NY.Google Scholar
Jelinek, F. 1998. Statistical Methods for Speech Recognition. MIT Press, Cambridge, MA. Google ScholarDigital Library
Johnson, H., Martin, J., Foster, G., and Kuhn, R. 2007. Improving translation quality by discarding most of the phrasetable. In Proceedings of EMNLP-CoNLL. 967--975.Google Scholar
Joshi, A. K. and Schabes, Y. 1997. Tree-adjoining grammars. In Handbook of Formal Languages, G. Rozenberg and A. Salomaa, Eds. Vol. 3. Springer, Berlin, Germany, 69--124. Google ScholarDigital Library
Joshi, A. K., Vijay-Shanker, K., and Weir, D. 1991. The convergence of mildly context-sensitive grammar formalisms. In Foundational Issues in Natural Language Processing, P. Sells, S. Shieber, and T. Wasow, Eds. MIT Press, Cambridge, MA, Chapter 2, 31--81.Google Scholar
Jurafsky, D. and Martin, J. H. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2nd Ed. Prentice-Hall. Google ScholarDigital Library
Kirchhoff, K. and Yang, M. 2005. Improved language modeling for statistical machine translation. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 125--128. Google ScholarDigital Library
Knight, K. 1997. Automating knowledge acquisition for machine translation. AI Mag. 18, 4, 81--96.Google Scholar
Knight, K. 1999a. Decoding complexity in word-replacement translation models. Comput. Linguist. 25, 4, 607--615. Google ScholarDigital Library
Knight, K. 1999b. A statistical MT tutorial workbook. Unpublished. http://www.cisp.jhu.edu/ws99/projects/mt/wkbk.rtf.Google Scholar
Knight, K. and Al-Onaizan, Y. 1998. Translation with finite-state devices. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). 421--437. Google ScholarDigital Library
Knight, K. and Graehl, J. 2005. An overview of probabilistic tree transducers for natural language processing. In Proceedings of CICLing. Google ScholarDigital Library
Knight, K. and Marcu, D. 2005. Machine translation in the year 2004. In Proceedings of ICASSP.Google Scholar
Koehn, P. 2004a. Pharaoh: A beam search decoder for phrase-based statistical machine translation models. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA).Google ScholarCross Ref
Koehn, P. 2004b. PHARAOH, A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models, User Manual and Description for Version 1.2 USC Information Sciences Institute.Google Scholar
Koehn, P. 2004c. Statistical significance tests for machine translation evaluation. In Proceedings of EMNLP. 388--395.Google Scholar
Koehn, P. 2005. Europarl: A parallel corpus for statistical machine translation. In Proceedings of MT Summit.Google Scholar
Koehn, P. 2008. Statistical Machine Translation. Cambridge University Press. To appear. Google ScholarDigital Library
Koehn, P. and Hoang, H. 2007. Factored translation models. In Proceedings of EMNLP-CoNLL. 868--876.Google Scholar
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., and Herbst, E. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of ACL Demo and Poster Sessions. 177--180. Google ScholarDigital Library
Koehn, P. and Monz, C. 2005. Shared task: Statistical machine translation between european languages. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 119--124. Google ScholarDigital Library
Koehn, P. and Monz, C. 2006. Manual and automatic evaluation of machine translation between european languages. In Proceedings of NAACL Workshop on Statistical Machine Translation. 102--121. Google ScholarDigital Library
Koehn, P., Och, F. J., and Marcu, D. 2003. Statistical phrase-based translation. In Proceedings of HLT-NAACL. 127--133. Google ScholarDigital Library
Kulesza, A. and Shieber, S. M. 2004. A learning approach to improving sentence-level MT evaluation. In Proceedings of the International Conference on Theoretical and Methodological Issues in Machine Translation (TMI).Google Scholar
Kumar, S. and Byrne, W. 2004. Minimum bayes-risk decoding for statistical machine translation. In Proceedings of HLT-NAACL. 169--176.Google Scholar
Kumar, S., Deng, Y., and Byrne, W. 2006. A weighted finite state transducer translation template model for statistical machine translation. Natural Lang. Engin. 12, 1, 35--75. Google ScholarDigital Library
Lari, K. and Young, S. J. 1990. The estimation of stochastic context-free grammars using the inside-outside algorithm. Comput. Speech Lang. 4, 1.Google ScholarCross Ref
Lewis, P. M. I. and Stearns, R. E. 1968. Syntax-directed transductions. J. ACM 15, 465--488. Google ScholarDigital Library
Liang, P., Bouchard-Côté, A., Taskar, B., and Klein, D. 2006. An end-to-end discriminative approach to machine translation. In Proceedings of ACL-COLING. 761--768. Google ScholarDigital Library
Liang, P., Taskar, B., and Klein, D. 2006. Alignment by agreement. In Proceedings of HLT-NAACL. 104--111. Google ScholarDigital Library
Lin, C.-Y. and Och, F. J. 2004. Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics. In Proceedings of the Association for Computational Linguistics (ACL). 606--613. Google ScholarDigital Library
Lita, L., Rogati, M., and Lavie, A. 2005. BLANC: Learning evaluation metrics for MT. In Proceedings of HLT-EMNLP. 740--747. Google ScholarDigital Library
Liu, D. and Gildea, D. 2007. Source-language features and maximum correlation training for machine translation evaluation. In Proceedings of HLT-NAACL. 41--48.Google Scholar
Lopez, A. 2007. Hierarchical phrase-based translation with suffix arrays. In Proceedings of EMNLP-CoNLL. 976--985.Google Scholar
Lopez, A. and Resnik, P. 2005. Improved HMM alignment models for languages with scarce resources. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 83--86. Google ScholarDigital Library
Lopez, A. and Resnik, P. 2006. Word-based alignment, phrase-based translation: What's the link&quest; In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). 90--99.Google Scholar
Manning, C. D. and Schütze, H. 1999. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA. Google ScholarDigital Library
Marcu, D., Wang, W., Echihabi, A., and Knight, K. 2006. SPMT: Statistical machine translation with syntactified target language phrases. In Proceedings of EMNLP. 44--52. Google ScholarDigital Library
Marcu, D. and Wong, W. 2002. A phrase-based, joint probability model for statistical machine translation. In Proceedings of EMNLP. 133--139. Google ScholarDigital Library
Marcus, M. P., Marcinkiewicz, M. A., and Santorini, B. 1993. Building a large annotated corpus of English: The Penn Treebank. Comput. Linguist. 19, 2, 314--330. Google ScholarDigital Library
Matusov, E., Zens, R., and Ney, H. 2004. Symmetric word alignments for statistical machine translation. In Proceedings of COLING. 219--225. Google ScholarDigital Library
Melamed, I. D. 1996. Automatic construction of clean broad-coverage translation lexicons. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA).Google Scholar
Melamed, I. D. 1998. Manual annotation of translational equivalence: The blinker project. Tech. rep. 98-07, University of Pennsylvania Institute for Research in Cognitive Science.Google Scholar
Melamed, I. D. 2000. Models of translational equivalence among words. Comput. Linguist. 26, 2, 221--249. Google ScholarDigital Library
Melamed, I. D. 2003. Multitext grammars and synchronous parsers. In Proceedings of HLT-NAACL. 79--86. Google ScholarDigital Library
Melamed, I. D. 2004a. Algorithms for syntax-aware statistical machine translation. In Proceedings of Theoretical and Methodological Issues in Machine Translation (TMI).Google Scholar
Melamed, I. D. 2004b. Statistical machine translation by parsing. In Proceedings of the Association for Computational Linguistics (ACL). 654--661. Google ScholarDigital Library
Melamed, I. D., Green, R., and Turian, J. P. 2003. Precision and recall of machine translation. In Proceedings of HLT-NAACL (Companion Vol.). 61--63. Google ScholarDigital Library
Melamed, I. D., Satta, G., and Wellington, B. 2004. Generalized multitext grammars. In Proceedings of the Association for Computational Linguistics (ACL). 662--669. Google ScholarDigital Library
Merialdo, B. 1994. Tagging English text with a probabilistic model. Comput. Linguist. 20, 2, 155--172. Google ScholarDigital Library
Mihalcea, R. and Pedersen, T. 2003. An evaluation exercise for word alignment. In Proceedings of HLT-NAACL Workshop on Building and Using Parallel Texts. 1--10. Google ScholarDigital Library
Minkov, E., Toutanova, K., and Suzuki, H. 2007. Generating complex morphology for machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 128--135.Google Scholar
Mitchell, T. M. 1997. Machine Learning. McGraw-Hill. Google ScholarDigital Library
Moore, R. C. 2004. Improving IBM word-alignment model 1. In Proceedings of the Association for Computational Linguistics (ACL). 519--526. Google ScholarDigital Library
Moore, R. C. 2005a. Association-based bilingual word alignment. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 1--8. Google ScholarDigital Library
Moore, R. C. 2005b. A discriminative framework for bilingual word alignment. In Proceedings of HLT-EMNLP. 81--88. Google ScholarDigital Library
Nießen, S. and Ney, H. 2004. Statistical machine translation with scarce resources using morpho-syntactic information. Comput. Linguist. 30, 2, 182--204. Google ScholarDigital Library
Nießen, S., Vogel, S., Ney, H., and Tillman, C. 1998. A DP based search algorithm for statistical machine translation. In Proceedings of ACL-COLING. 960--967. Google ScholarDigital Library
Oard, D. W., Doermann, D., Dorr, B., He, D., Resnik, P., Weinberg, A., Byrne, W., Khudanpur, S., Yarowsky, D., Leuski, A., Koehn, P., and Knight, K. 2003. Desperately seeking Cebuano. In Proceedings of HLT-NAACL (Companion Vol.). 76--78. Google ScholarDigital Library
Oard, D. W. and Och, F. J. 2003. Rapid-response machine translation for unexpected languages. In Proceedings of MT Summit IX.Google Scholar
Och, F. J. 1999. An efficient method for determining bilingual word classes. In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL). 71--76. Google ScholarDigital Library
Och, F. J. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). Google ScholarDigital Library
Och, F. J. 2005. Statistical machine translation: The fabulous present and future. In Proceedings of ACL Workshop on Building and Using Parallel Texts. Invited talk.Google Scholar
Och, F. J., Gildea, D., Khudanpur, S., Sarkar, A., Yamada, K., Fraser, A., Kumar, S., Shen, L., Smith, D., Eng, K., Jain, V., Jin, Z., and Radev, D. 2004a. Final report of Johns Hopkins 2003 summer workshop on syntax for statistical machine translation. Tech. rep., Johns Hopkins University.Google Scholar
Och, F. J., Gildea, D., Khudanpur, S., Sarkar, A., Yamada, K., Fraser, A., Kumar, S., Shen, L., Smith, D., Eng, K., Jain, V., Jin, Z., and Radev, D. 2004b. A smorgasbord of features for statistical machine translation. In Proceedings of HLT-NAACL. 161--168.Google Scholar
Och, F. J. and Ney, H. 2000. A comparison of alignment models for statistical machine translation. In Proceedings of COLING. 1086--1090. Google ScholarDigital Library
Och, F. J. and Ney, H. 2001. Statistical multi-source translation. In Proceedings of MT Summit.Google Scholar
Och, F. J. and Ney, H. 2002. Discriminative training and maximum entropy models for machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 156--163. Google ScholarDigital Library
Och, F. J. and Ney, H. 2003. A systematic comparison of various statistical alignment models. Comput. Linguist. 29, 1, 19--51. Google ScholarDigital Library
Och, F. J. and Ney, H. 2004. The alignment template approach to machine translation. Comput. Linguist. 30, 4, 417--449. Google ScholarDigital Library
Och, F. J., Tillman, C., and Ney, H. 1999. Improved alignment models for statistical machine translation. In Proceedings of EMNLP-VLC. 20--28.Google Scholar
Och, F. J., Ueffing, N., and Ney, H. 2001. An efficient A&ast; search algorithm for statistical machine translation. In Proceedings of ACL Workshop on Data-Driven Methods in Machine Translation. 55--62. Google ScholarDigital Library
Olteanu, M., Davis, C., Volosen, I., and Moldovan, D. 2006. Phramer—an open source statistical phrase-based translator. In Proceedings of HLT-NAACL Workshop on Statistical Machine Translation. 146--149. Google ScholarDigital Library
Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 311--318. Google ScholarDigital Library
Popovic, M., de Gispert, A., Gupta, D., Lambert, P., Ney, H., Mariño, J. B., Federico, M., and Banchs, R. 2006. Morpho-syntactic information for automatic error analysis of statistical machine translation output. In Proceedings of NAACL Workshop on Statistical Machine Translation. 1--6. Google ScholarDigital Library
Quirk, C., Menezes, A., and Cherry, C. 2005. Dependency treelet translation: Syntactically informed phrasal SMT. In Proceedings of the Association for Computational Linguistics (ACL). 271--279. Google ScholarDigital Library
Ratnaparkhi, A. 1998. Maximum entropy models for natural language ambiguity resolution. Ph.D. thesis, University of Pennsylvania. Google ScholarDigital Library
Resnik, P., Olsen, M. B., and Diab, M. 1997. Creating a parallel corpus from the Book of 2000 Tongues. In Proceedings of the Text Encoding Initiative 10th Anniversary User Conference (TEI-10).Google Scholar
Resnik, P. and Smith, N. A. 2003. The Web as parallel corpus. Comput. Linguist. 29, 3, 349--380. Google ScholarDigital Library
Russell, S. and Norvig, P. 2003. Artificial Intelligence: A Modern Approach 2nd Ed. Prentice-Hall.
Schafer, C. and Drabek, E. F. 2005. Models for Inuktitut-English word alignment. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 79--82.
Shen, L., Sarkar, A., and Och, F. J. 2004. Discriminative reranking for machine translation. In Proceedings of HLT-NAACL. 177--184.
Shieber, S. M. and Schabes, Y. 1990. Synchronous tree-adjoining grammars. In Proceedings of COLING. 253--258.
Simard, M., Cancedda, N., Cavestro, B., Dymetman, M., Gaussier, E., Goutte, C., Yamada, K., Langlais, P., and Mauser, A. 2005. Translating with non-contiguous phrases. In Proceedings of HLT-EMNLP. 755--762.
Sipser, M. 2005. Introduction to the Theory of Computation 2nd Ed. PWS Publishing.
Smith, D. A. and Smith, N. 2004. Bilingual parsing with factored estimation: Using English to parse Korean. In Proceedings of EMNLP. 49--56.
Smith, N. A. 2002. From words to corpora: Recognizing translation. In Proceedings of EMNLP. 95--102.
Smith, N. A. 2006. Novel estimation methods for unsupervised discovery of latent structure in natural language text. Ph.D. thesis, Johns Hopkins University.
Snover, M., Dorr, B., Schwartz, R., Micciulla, L., and Makhoul, J. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). 223--231.
Talbot, D. and Osborne, M. 2007a. Randomised language modelling for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 512--519.
Talbot, D. and Osborne, M. 2007b. Smoothed bloom filter language models: Tera-scale LMs on the cheap. In Proceedings of the Association for Computational Linguistics (ACL). 468--476.
Taskar, B. 2004. Learning structured prediction models: A large-margin approach. Ph.D. thesis, Stanford University.
Taskar, B., Lacoste-Julien, S., and Klein, D. 2005. A discriminative matching approach to word alignment. In Proceedings of HLT-EMNLP. 73--80.
Tillmann, C. and Ney, H. 2003. Word reordering and a dynamic programming beam search algorithm for statistical machine translation. Comput. Linguist. 29, 1, 98--133.
Tillmann, C., Vogel, S., Ney, H., and Zubiaga, A. 1997. A DP-based search using monotone alignments in statistical translation. In Proceedings of ACL-EACL. 289--296.
Tillmann, C. and Zhang, T. 2006. A discriminative global training algorithm for statistical MT. In Proceedings of ACL-COLING. 721--728.
Toutanova, K., Ilhan, H. T., and Manning, C. D. 2002. Extensions to HMM-based statistical word alignment models. In Proceedings of EMNLP. 87--94.
Turian, J. P., Shen, L., and Melamed, I. D. 2003. Evaluation of machine translation and its evaluation. In Proceedings of MT Summit IX.
Ueffing, N., Haffari, G., and Sarkar, A. 2007. Transductive learning for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 25--32.
Ueffing, N. and Ney, H. 2005. Word-level confidence estimation for machine translation using phrase-based translation models. In Proceedings of HLT-EMNLP. 763--770.
Ueffing, N., Och, F. J., and Ney, H. 2002. Generation of word graphs in statistical machine translation. In Proceedings of EMNLP. 156--163.
Venugopal, A., Zollmann, A., and Vogel, S. 2007. An efficient two-pass approach to synchronous-CFG driven statistical MT. In Proceedings of HLT-NAACL.
Venugopal, A., Zollmann, A., and Waibel, A. 2005. Training and evaluating error minimization rules for statistical machine translation. In Proceedings of ACL Workshop on Building and Using Parallel Texts. 208--215.
Vijay-Shanker, K., Weir, D., and Joshi, A. K. 1987. Characterizing structural descriptions produced by various grammatical formalisms. In Proceedings of the Association for Computational Linguistics (ACL). 104--111.
Viterbi, A. J. 1967. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory 13, 2, 260--269.
Vogel, S., Ney, H., and Tillmann, C. 1996. HMM-based word alignment in statistical machine translation. In Proceedings of COLING. 836--841.
Wang, C., Collins, M., and Koehn, P. 2007. Chinese syntactic reordering for statistical machine translation. In Proceedings of EMNLP-CoNLL. 737--745.
Wang, J. 2005. Matching meaning for cross-language information retrieval. Ph.D. thesis, University of Maryland.
Wang, Y.-Y. and Waibel, A. 1997. Decoding algorithm in statistical machine translation. In Proceedings of ACL-EACL. 366--372.
Watanabe, T. and Sumita, E. 2002. Bidirectional decoding for statistical machine translation. In Proceedings of COLING. 1079--1085.
Watanabe, T., Suzuki, J., Tsukada, H., and Isozaki, H. 2007. Online large-margin training for statistical machine translation. In Proceedings of EMNLP-CoNLL. 764--773.
Weaver, W. 1955. Translation. In Machine Translation of Languages: Fourteen Essays, W. N. Locke and A. D. Booth, Eds. MIT Press, Chapter 1, 15--23.
Wellington, B., Turian, J., Pike, C., and Melamed, I. D. 2006. Scalable purely-discriminative training for word and tree transducers. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA). 251--260.
Wellington, B., Waxmonsky, S., and Melamed, I. D. 2006. Empirical lower bounds on the complexity of translational equivalence. In Proceedings of ACL-COLING. 977--984.
White, J. S., O'Connell, T., and O'Mara, F. 1994. The ARPA MT evaluation methodologies: Evolution, lessons and future approaches. In Proceedings of the Association for Machine Translation in the Americas.
Wu, D. 1995a. Grammarless extraction of phrasal translation examples from parallel texts. In Proceedings of the Theoretical and Methodological Issues in Machine Translation (TMI). 354--372.
Wu, D. 1995b. Stochastic inversion transduction grammars, with application to segmentation, bracketing, and alignment of parallel corpora. In Proceedings of IJCAI. 1328--1335.
Wu, D. 1996. A polynomial-time algorithm for statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 152--158.
Wu, D. and Wong, H. 1998. Machine translation with a stochastic grammatical channel. In Proceedings of ACL-COLING. 1408--1415.
Xiong, D., Liu, Q., and Lin, S. 2006. Maximum entropy based phrase reordering model for statistical machine translation. In Proceedings of ACL-COLING. 521--528.
Yamada, K. and Knight, K. 2001. A syntax-based statistical translation model. In Proceedings of ACL-EACL.
Yamada, K. and Knight, K. 2002. A decoder for syntax-based statistical MT. In Proceedings of the Association for Computational Linguistics (ACL). 303--310.
Yarowsky, D. and Ngai, G. 2001. Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora. In Proceedings of NAACL. 200--207.
Yarowsky, D., Ngai, G., and Wicentowski, R. 2001. Inducing multilingual text analysis tools via robust projection across aligned corpora. In Proceedings of the Human Language Technology Conference (HLT). 109--116.
Zens, R. and Ney, H. 2003. A comparative study on reordering constraints in statistical machine translation. In Proceedings of the Association for Computational Linguistics (ACL). 144--151.
Zens, R. and Ney, H. 2004. Improvements in phrase-based statistical machine translation. In Proceedings of HLT-NAACL. 257--264.
Zens, R. and Ney, H. 2007. Efficient phrase-table representation for machine translation with applications to online MT and speech translation. In Proceedings of HLT-NAACL.
Zhang, H. and Gildea, D. 2005. Stochastic lexicalized inversion transduction grammar for alignment. In Proceedings of the Association for Computational Linguistics (ACL). 475--482.
Zhang, H., Huang, L., Gildea, D., and Knight, K. 2006. Synchronous binarization for machine translation. In Proceedings of HLT-NAACL. 256--263.
Zhang, Y., Hildebrand, A. S., and Vogel, S. 2006. Distributed language modeling for N-best list re-ranking. In Proceedings of EMNLP. 216--223.
Zhang, Y. and Vogel, S. 2005. An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora. In Proceedings of EAMT.

Index Terms

Statistical machine translation

Reviews

Reviewer: Rathinasamy B. Lenin

Through this self-contained literature survey, Lopez characterizes the core ideas of statistical machine translation (SMT) and provides a taxonomy of various approaches. The survey outlines five important factors that contribute to the interest in SMT. Then, it discusses in detail four steps for building a functioning SMT system: a translation equivalence model, parameterization, parameter estimation, and decoding. The first step includes two formalisms that are generalizations of finite-state automata (FSA): finite-state transducers (FST) and synchronous context-free grammars (SCFG). The second step involves designing a function "to assign a real-valued score to any pair of source and target sentences." The decoding step, which translates new input sentences, is framed as a maximization problem. For this step, two types of decoding techniques, for FST and SCFG models, are discussed through an extensive literature survey. The survey discusses the importance of reranking or rescoring, and data structures for model representation. It also covers the bilingual evaluation understudy (BLEU) metric for evaluating the performance of SMT systems. Finally, the survey outlines current directions, based on recently published papers on SMT, and directions for future research. In summary, this is a good survey paper that will be useful to researchers in this important area of research, especially beginners.

Online Computing Reviews Service

Published in

ACM Computing Surveys Volume 40, Issue 3
August 2008
155 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/1380584

Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 August 2008
- Accepted: 1 October 2007
- Revised: 1 August 2007
- Received: 1 March 2006
Author Tags
Natural language processing
machine translation
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 179
- Total Downloads: 6,811
- Downloads (Last 12 months): 509
- Downloads (Last 6 weeks): 86