
Improving Neural Machine Translation Model with Deep Encoding Information


Abstract

The availability of massive computational power, together with advances in deep neural network (DNN) technology, has driven rapid progress in machine translation. The strong representational capacity of deep neural networks allows neural machine translation (NMT) to exploit large-scale bilingual parallel corpora and modern computing resources to build highly effective translation models. Nevertheless, existing NMT models typically use only the output of the top encoder layer, while the information available in the other layers of the deep encoder is ignored, which significantly constrains translation performance. To address this issue, we propose a novel neural machine translation model that fully exploits the deep encoding information. The core idea is to aggregate the information from different encoder layers in different ways. We design three aggregation strategies: parallel layer, multi-layer, and dynamic layer aggregation of encoding information. The three corresponding translation models are trained and compared with a baseline Transformer model on a Chinese-to-English translation task. The experimental results show that the BLEU-4 score of the proposed model improves by 0.89 over the baseline, demonstrating the effectiveness of the proposed method.
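
To make the idea concrete, the following minimal PyTorch-style sketch (not the authors' exact architecture; the module and parameter names such as LayerAggregator are illustrative assumptions) shows one way to expose more than the top encoder layer to the decoder: a learned softmax-weighted mixture of all layer outputs, in the spirit of dynamic layer aggregation.

import torch
import torch.nn as nn

class LayerAggregator(nn.Module):
    """Combines per-layer encoder states into one representation for the decoder."""
    def __init__(self, num_layers: int, d_model: int):
        super().__init__()
        # One scalar weight per encoder layer, normalized with softmax.
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, layer_states):
        # layer_states: list of tensors, each (batch, src_len, d_model),
        # one per encoder layer, ordered bottom to top.
        stacked = torch.stack(layer_states, dim=0)            # (L, B, S, D)
        weights = torch.softmax(self.layer_logits, dim=0)     # (L,)
        mixed = (weights.view(-1, 1, 1, 1) * stacked).sum(0)  # (B, S, D)
        return self.proj(mixed)  # passed to the decoder's cross-attention

# Usage (hypothetical): collect the hidden state after every encoder layer, then aggregate.
# aggregator = LayerAggregator(num_layers=6, d_model=512)
# memory = aggregator([h1, h2, h3, h4, h5, h6])

The parallel-layer and multi-layer strategies described in the paper would combine the layer states differently; this sketch only illustrates the general mechanism of aggregating deep encoding information rather than relying on the top layer alone.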





Funding

This work was supported by the National Natural Science Foundation of China (No. U19A2059) and by the Ministry of Science and Technology of Sichuan Province Program (2020YFG0328).

Author information


Corresponding author

Correspondence to Tianxi Huang.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.


About this article


Cite this article

Duan, G., Yang, H., Qin, K. et al. Improving Neural Machine Translation Model with Deep Encoding Information. Cogn Comput 13, 972–980 (2021). https://doi.org/10.1007/s12559-021-09860-7



  • DOI: https://doi.org/10.1007/s12559-021-09860-7
