Abstract
Minimum error rate training (MERT) [1] is probably still the most widely used parameter learning algorithm in statistical machine translation (SMT). However, it does not scale to a large number of features (e.g., 30 or more). Moreover, MERT operates in parameter space and is only a local optimization algorithm. In this paper, we investigate for the first time the use of metaheuristics and global optimization techniques for the problem of learning parameters in SMT. In particular, we replace MERT with a well-known metaheuristic for global optimization, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [2]. We test the effectiveness of CMA-ES by conducting SMT experiments on an English-Vietnamese corpus. The results show that the SMT system tuned with CMA-ES achieves higher BLEU scores than the baseline SMT system tuned with MERT on both the dev and test data sets.
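The idea of tuning log-linear feature weights with an evolution strategy can be illustrated with a minimal sketch. It is not the authors' implementation and not full CMA-ES: it samples isotropically around a mean and shrinks the step size by a fixed factor, with no covariance matrix adaptation. The n-best lists, feature values, quality scores, and the names corpus_score and evolve are all hypothetical, chosen only for illustration; the quality score stands in for a real metric such as BLEU.

```python
import random

# Toy n-best lists (made-up numbers): for each source sentence, a list of
# candidates given as (feature_vector, quality). Quality is a stand-in for a
# sentence-level translation metric.
NBEST = [
    [((-1.0, -2.0, 0.5), 0.2), ((-1.5, -1.0, 0.1), 0.7), ((-0.5, -3.0, 0.9), 0.4)],
    [((-2.0, -1.0, 0.3), 0.6), ((-1.0, -2.5, 0.8), 0.1), ((-1.2, -1.1, 0.2), 0.5)],
    [((-0.8, -0.9, 0.4), 0.3), ((-1.1, -0.7, 0.6), 0.8), ((-2.0, -2.0, 0.1), 0.2)],
]


def corpus_score(weights, nbest=NBEST):
    """Average quality of the candidates selected by the log-linear model."""
    total = 0.0
    for candidates in nbest:
        # The model picks the candidate with the highest weighted feature sum.
        best = max(
            candidates,
            key=lambda c: sum(w * f for w, f in zip(weights, c[0])),
        )
        total += best[1]
    return total / len(nbest)


def evolve(objective, dim, generations=50, lam=12, mu=4, sigma=1.0, seed=0):
    """Simplified (mu, lambda) evolution strategy that maximizes `objective`."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    best_w, best_f = list(mean), objective(mean)
    for _ in range(generations):
        # Sample lam candidate weight vectors around the current mean.
        population = []
        for _ in range(lam):
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            population.append((objective(x), x))
        # Keep the mu best samples and move the mean to their average.
        population.sort(key=lambda p: p[0], reverse=True)
        elite = population[:mu]
        mean = [sum(x[i] for _, x in elite) / mu for i in range(dim)]
        # Remember the best weight vector seen so far.
        if elite[0][0] > best_f:
            best_f, best_w = elite[0][0], elite[0][1]
        # Fixed decay: a crude stand-in for CMA-ES step-size adaptation.
        sigma *= 0.97
    return best_w, best_f


weights, score = evolve(corpus_score, dim=3)
print("tuned weights:", weights)
print("toy corpus score:", score)
```

Because the objective is only evaluated, never differentiated, the same loop applies to a non-smooth metric computed over decoder output. A real setup would replace corpus_score with a call that decodes or reranks the dev set and computes BLEU.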
References
Och, F.J.: Minimum error rate training in statistical machine translation. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pp. 160–167. Association for Computational Linguistics, Sapporo (2003)
Hansen, N.: The CMA evolution strategy: A comparing review. In: Lozano, J.A., Larrañaga, P., Inza, I., Bengoetxea, E. (eds.) Towards a New Evolutionary Computation. STUDFUZZ, vol. 192, pp. 75–102. Springer, Heidelberg (2006)
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: Open source toolkit for statistical machine translation. In: Proceedings of ACL, Demonstration Session (2007)
Koehn, P.: Statistical Machine Translation. Cambridge University Press (2010)
Smith, D.A., Eisner, J.: Minimum risk annealing for training log-linear models. In: Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 787–794. Association for Computational Linguistics, Sydney (2006)
Huang, L., Mi, H.: Efficient incremental decoding for tree-to-string translation. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 273–283. Association for Computational Linguistics, Cambridge (2010)
Suzuki, J., Tsukada, H., Watanabe, T., Isozaki, H.: Online large-margin training for statistical machine translation. In: Proceedings of EMNLP-CoNLL, Prague, pp. 764–773 (June 2007)
Knight, K., Wang, W., Chiang, D.: 11,001 new features for statistical machine translation. In: Proceedings of Human Language Technologies: The 2009 Annual Conference of the NAACL, Stroudsburg, PA, USA (June 2009)
Arun, A., Koehn, P.: Online learning methods for discriminative training of phrase based statistical machine translation. In: MT Summit XI, Copenhagen (September 2007)
Crammer, K., McDonald, R., Pereira, F.: Online large-margin training of dependency parsers. In: Proceedings of ACL (2005)
Teukolsky, S.A., Flannery, B.P., Press, W.H., Vetterling, W.T.: Numerical Recipes in C++: the art of scientific computing, 2nd edn. Cambridge University Press, New York (2002)
Akimoto, Y., Nagata, Y., Ono, I., Kobayashi, S.: Bidirectional relation between CMA evolution strategies and natural evolution strategies. In: Schaefer, R., Cotta, C., Kołodziej, J., Rudolph, G. (eds.) PPSN XI. LNCS, vol. 6238, pp. 154–163. Springer, Heidelberg (2010)
Ho, T.B., Nguyen, M.L., Nguyen, T.P., Shimazu, A., Van Nguyen, V.: A tree-to-string phrase-based model for statistical machine translation. In: Proceedings of the Twelfth Conference on Computational Natural Language Learning (CoNLL 2008), Manchester, England, pp. 143–150. Coling 2008 Organizing Committee (August 2008)
Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Koehn, P., Hoang, H., Herbst, E.: Moses: Open source toolkit for statistical machine translation. In: Proceedings of ACL, Demonstration Session (2007)
Stolcke, A.: SRILM - an extensible language modeling toolkit. In: Proceedings of International Conference on Spoken Language Processing, Cambridge, MA, vol. 9, pp. 901–904 (2002)
Roukos, S., Ward, T., Papineni, K., Zhu, W.J.: Bleu: A method for automatic evaluation of machine translation. In: ACL (2002)
Fortin, F.A., De Rainville, F.M., Gardner, M.A., Parizeau, M., Gagné, C.: DEAP: Evolutionary algorithms made easy. Journal of Machine Learning Research 13, 2171–2175 (2012)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Tran, VH., Pham, AT., Nguyen, VV., Nguyen, HX., Nguyen, HQ. (2015). Parameter Learning for Statistical Machine Translation Using CMA-ES. In: Nguyen, VH., Le, AC., Huynh, VN. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 326. Springer, Cham. https://doi.org/10.1007/978-3-319-11680-8_34
DOI: https://doi.org/10.1007/978-3-319-11680-8_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11679-2
Online ISBN: 978-3-319-11680-8
eBook Packages: Engineering, Engineering (R0)