
Recurrent graph encoder for syntax-aware neural machine translation

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Self-attention networks (SAN) have achieved promising performance in a variety of NLP tasks, e.g., neural machine translation (NMT), as they can directly model dependencies among words. However, they are weaker than recurrent neural networks (RNN) at learning positional information. Two natural questions arise: (1) Can we design an RNN-based component that is directly guided by syntactic dependencies? (2) Does such a syntax-enhanced sequence-modeling component benefit existing NMT architectures, e.g., RNN-based NMT and Transformer-based NMT? To answer these questions, we propose a simple yet effective recurrent graph syntax encoder, dubbed RGSE, which exploits off-the-shelf syntactic dependencies together with the intrinsic recurrence property of RNNs, so that RGSE models syntactic dependencies and sequential information (i.e., word order) simultaneously. Experiments on several neural machine translation tasks demonstrate that RGSE-equipped RNN and Transformer models achieve consistent and significant improvements over several strong syntax-aware baselines, with only a minuscule increase in parameters. Extensive analysis further shows that RGSE preserves syntactic and semantic information better than SAN and is more robust to syntactic noise than existing syntax-aware NMT models.
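The full architecture is described only in the body of the paper, so the sketch below is merely an assumption-laden illustration of the idea stated in the abstract: interleave a recurrence over word order with aggregation over dependency-graph neighbours obtained from an external parser. The class name, the gating choice (a GRU cell), and mean-pooling over neighbours are hypothetical, not the authors' exact RGSE design.

```python
# Minimal, hypothetical sketch of a recurrent graph syntax encoder layer.
# The gating, neighbour aggregation, and names are illustrative assumptions,
# NOT the authors' exact RGSE formulation.
import torch
import torch.nn as nn


class RecurrentGraphSyntaxEncoder(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        # GRU recurrence carries sequential information (word order).
        self.cell = nn.GRUCell(2 * hidden_size, hidden_size)
        # Projection for the syntactic-neighbour context (graph part).
        self.neighbor_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, word_states: torch.Tensor, adjacency: torch.Tensor) -> torch.Tensor:
        """word_states: (seq_len, hidden) token representations.
        adjacency: (seq_len, seq_len) 0/1 float matrix from an off-the-shelf
        dependency parse; adjacency[i, j] = 1 if token j is a head or
        dependent of token i."""
        seq_len, hidden = word_states.size()
        h = word_states.new_zeros(1, hidden)           # recurrent state (batch of 1)
        outputs = []
        for t in range(seq_len):
            # Average the states of token t's syntactic neighbours.
            mask = adjacency[t].unsqueeze(-1)          # (seq_len, 1)
            denom = mask.sum().clamp(min=1.0)
            neighbors = (mask * word_states).sum(dim=0) / denom
            syntax_ctx = torch.tanh(self.neighbor_proj(neighbors))
            # Recurrent update over the token plus its syntactic context.
            x = torch.cat([word_states[t], syntax_ctx], dim=-1).unsqueeze(0)
            h = self.cell(x, h)
            outputs.append(h.squeeze(0))
        return torch.stack(outputs, dim=0)             # syntax- and order-aware states
```

A layer of this kind could, in principle, be stacked on top of either an RNN or a Transformer encoder, which matches the abstract's claim that RGSE benefits both architectures.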


Notes

  1. http://www.statmt.org/wmt16/translation-task.html.

  2. https://github.com/tensorflow/models/tree/master/research/syntaxnet.

  3. http://www.tkl.iis.u-tokyo.ac.jp/ynaga/jdepp/.

  4. The syntactic dependency label remains on each substring if a word is split by BPE (see the sketch after these notes).

  5. See footnote 4.
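Footnote 4 only states the bookkeeping rule: when BPE splits a word into several subwords, every subword inherits that word's dependency label. A tiny, assumed helper illustrating this (the function name and the `bpe_segment` interface are hypothetical):

```python
def propagate_labels_to_subwords(words, labels, bpe_segment):
    """Copy each word's dependency label onto all of its BPE subwords.

    words:       source tokens, e.g. ["translation", "models"]
    labels:      parallel dependency labels, e.g. ["nsubj", "root"]
    bpe_segment: callable mapping a word to its subword pieces,
                 e.g. "translation" -> ["trans@@", "lation"]  (assumed interface)
    """
    subwords, sub_labels = [], []
    for word, label in zip(words, labels):
        pieces = bpe_segment(word)
        subwords.extend(pieces)
        sub_labels.extend([label] * len(pieces))  # the label remains on each piece
    return subwords, sub_labels
```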


Author information


Corresponding author

Correspondence to Longyue Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ding, L., Wang, L. & Liu, S. Recurrent graph encoder for syntax-aware neural machine translation. Int. J. Mach. Learn. & Cyber. 14, 1053–1062 (2023). https://doi.org/10.1007/s13042-022-01682-9
