
Speed Up the Training of Neural Machine Translation


Abstract

Neural machine translation (NMT) has made notable progress in recent years. Although existing models provide reasonable translation performance, they require a great deal of training time; in particular, when the corpus is enormous, their computational cost becomes extremely high. In this paper, we propose a novel NMT model based on the conventional bidirectional recurrent neural network (bi-RNN). In this model, we apply a tanh activation function, which captures future and history context information more fully, to speed up the training process. Experimental results on German–English and English–French translation tasks demonstrate that the proposed model saves considerable training time compared with state-of-the-art models while providing better translation performance.
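
As a concrete illustration of the kind of encoder the abstract describes, the sketch below builds a bidirectional RNN with a tanh activation in PyTorch. It is only an assumption-based example: the class name BiRNNEncoder, the embedding and hidden dimensions, and the usage snippet are illustrative and are not taken from the paper.

# Minimal sketch (not the authors' code): a bidirectional RNN encoder with a
# tanh nonlinearity, in the spirit of the bi-RNN model described in the abstract.
# All names and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class BiRNNEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # A plain (non-gated) bidirectional RNN with tanh activation:
        # the forward direction reads history context, the backward direction
        # reads future context, and the two are concatenated per time step.
        self.rnn = nn.RNN(emb_dim, hidden_dim, nonlinearity="tanh",
                          bidirectional=True, batch_first=True)

    def forward(self, src_tokens):
        embedded = self.embedding(src_tokens)   # (batch, length, emb_dim)
        outputs, hidden = self.rnn(embedded)    # outputs: (batch, length, 2 * hidden_dim)
        return outputs, hidden

# Usage example with random token ids (purely illustrative).
if __name__ == "__main__":
    encoder = BiRNNEncoder(vocab_size=10000)
    dummy_src = torch.randint(0, 10000, (8, 20))  # batch of 8 sentences, length 20
    ctx, _ = encoder(dummy_src)
    print(ctx.shape)                              # torch.Size([8, 20, 1024])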



Author information

Corresponding author

Correspondence to Wenxin Liang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Natural Science Foundation of China (Nos. 61632019, 61876028).


Cite this article

Liu, X., Wang, W., Liang, W. et al. Speed Up the Training of Neural Machine Translation. Neural Process Lett 51, 231–249 (2020). https://doi.org/10.1007/s11063-019-10084-y
