Abstract
Recent work has shown the effectiveness of neural probabilistic language models (NPLMs) in statistical machine translation (SMT), both for reranking n-best outputs and for direct decoding. However, several issues remain in applying NPLMs. In this paper we investigate these issues through detailed experiments and by extending state-of-the-art NPLMs. Our experiments on large-scale datasets show that our final setting, decoding with conventional n-gram LMs plus un-normalized feedforward NPLMs extended with word clusters, significantly improves translation performance, by up to 1.1 BLEU on average over four test sets, while keeping decoding time acceptable. The results also show that current NPLMs, whether feedforward or recurrent, still cannot simply replace n-gram LMs in SMT.
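As a reading aid, the sketch below illustrates why an un-normalized feedforward NPLM is cheap enough to query inside the decoder: the score of an n-gram is read off a single output unit, skipping the softmax sum over the whole vocabulary, and the word-cluster extension simply adds cluster embeddings for the context words. This is a minimal illustration, not the authors' implementation; all sizes, the ReLU hidden layer, the random cluster map, and the NumPy code itself are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)

VOCAB, CLUSTERS, EMB, HID = 10000, 512, 64, 128   # assumed sizes
ORDER = 5                                          # 4 context words + 1 target

# Parameters: word and cluster embeddings, one hidden layer, output layer.
E_word = rng.normal(0, 0.1, (VOCAB, EMB))
E_clus = rng.normal(0, 0.1, (CLUSTERS, EMB))
W_h = rng.normal(0, 0.1, ((ORDER - 1) * 2 * EMB, HID))
b_h = np.zeros(HID)
W_o = rng.normal(0, 0.1, (HID, VOCAB))
b_o = np.zeros(VOCAB)

# Stand-in for a word-to-cluster map (e.g. Brown clusters in the real setting).
word2cluster = rng.integers(0, CLUSTERS, VOCAB)


def hidden(context_words):
    # Concatenate word + cluster embeddings of the context, apply ReLU.
    feats = []
    for w in context_words:
        feats.append(E_word[w])
        feats.append(E_clus[word2cluster[w]])
    x = np.concatenate(feats)
    return np.maximum(0.0, x @ W_h + b_h)


def unnormalized_score(context_words, target):
    # Raw output activation for the target word only: O(HID) work per query,
    # no softmax sum over the vocabulary.
    h = hidden(context_words)
    return float(h @ W_o[:, target] + b_o[target])


def normalized_logprob(context_words, target):
    # Proper log-probability for comparison: requires the full softmax
    # normalization, which is the costly part avoided above.
    h = hidden(context_words)
    scores = h @ W_o + b_o
    m = scores.max()
    log_z = m + np.log(np.exp(scores - m).sum())
    return float(scores[target] - log_z)


ctx = [17, 4289, 5, 901]     # four context word ids for a 5-gram query
print(unnormalized_score(ctx, 42), normalized_logprob(ctx, 42))

In practice the un-normalized score is only usable if the model is trained so that raw scores behave approximately like log-probabilities (e.g. via noise-contrastive estimation or self-normalization); the sketch omits training entirely.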
Cite this paper
Zhao, Y., Huang, S., Chen, H., Chen, J. (2014). An Investigation on Statistical Machine Translation with Neural Language Models. In: Sun, M., Liu, Y., Zhao, J. (eds.) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL/NLP-NABD 2014. Lecture Notes in Computer Science, vol. 8801. Springer, Cham. https://doi.org/10.1007/978-3-319-12277-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12276-2
Online ISBN: 978-3-319-12277-9