Syntax-aware neural machine translation directed by syntactic dependency degree

  • Original Article
  • Published: 2021, Neural Computing and Applications

Abstract

There are various ways to incorporate syntactic knowledge into neural machine translation (NMT). However, quantifying the dependency syntactic intimacy (DSI) between word pairs in a dependency tree has not been considered for use in attentional and transformer-based NMT. In this paper, we propose a variant of Tree-LSTM to capture the syntactic dependency degree (SDD) between word pairs in dependency trees. We propose two syntax-aware distances: a tuned syntax distance and a \(\rho\)-dependent distance. For attentional NMT, we build two syntax-aware attentions on these distances and design a dual attention that simultaneously generates a global context and a dependency syntactic context. For transformer-based NMT, we explicitly incorporate dependency syntax into the self-attention network (SAN) to obtain a syntax-aware SAN. Experiments on the IWSLT’17 English–German, IWSLT Chinese–English and WMT’15 English–Finnish translation tasks show that our syntax-aware NMT significantly improves translation quality over baseline methods, including the state-of-the-art transformer-based NMT.
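To make the idea concrete, the sketch below biases attention logits with a \(\rho\)-weighted pairwise distance computed over a dependency tree, in the spirit of the \(\rho\)-dependent distance described above. It is a minimal illustration under assumptions, not the paper's implementation: the exact distance definition, the Tree-LSTM-based SDD, and all names here (tree_distance_matrix, syntax_aware_attention, rho) are illustrative.

    import numpy as np

    def tree_distance_matrix(heads):
        """Pairwise path lengths between words in a dependency tree.
        heads[i] is the index of word i's head; the root has head -1."""
        n = len(heads)

        def path_to_root(i):
            # Collect the chain of ancestors from word i up to the root.
            path = []
            while i != -1:
                path.append(i)
                i = heads[i]
            return path

        paths = [path_to_root(i) for i in range(n)]
        dist = np.zeros((n, n), dtype=np.float32)
        for i in range(n):
            for j in range(n):
                # The first shared ancestor on both paths is the LCA;
                # the tree distance is the sum of steps to reach it.
                common = set(paths[i]) & set(paths[j])
                li = next(k for k, w in enumerate(paths[i]) if w in common)
                lj = next(k for k, w in enumerate(paths[j]) if w in common)
                dist[i, j] = li + lj
        return dist

    def syntax_aware_attention(scores, heads, rho=0.5):
        """Penalize raw attention scores by rho times the tree distance,
        then renormalize with a softmax over the last axis."""
        dist = tree_distance_matrix(heads)
        biased = scores - rho * dist  # closer in the tree => larger weight
        e = np.exp(biased - biased.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

For example, with heads = [1, -1, 1] ("She eats apples", where "eats" is the root) and uniform scores, syntax_aware_attention shifts probability mass toward syntactically adjacent words; larger rho sharpens this effect.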



Notes

  1. https://github.com/nyu-dl/dl4mt-tutorial/blob/master/docs/cgru.tex.

  2. https://nlp.stanford.edu/software/dependencies_manual.pdf.

  3. https://sites.google.com/site/iwsltevaluation2017/Dialogues-task.

  4. http://statmt.org/wmt15/translation-task.html.

  5. Since there was no training set of suitable size, we combined the Chinese–English bilingual data of IWSLT 2012, 2013, 2014, 2015 and 2017, and removed duplicate sentences (a deduplication sketch follows this list).

  6. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu-detok.perl.

  7. https://github.com/moses-smt/mosesdecoder/blob/master/scripts/analysis/bootstrap-hypothesis-difference-significance.pl.

  8. https://github.com/OpenNMT/OpenNMT-py.
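As a concrete illustration of the corpus cleaning described in note 5, the following minimal sketch merges several parallel corpora and drops exact duplicate sentence pairs. This is an assumption about the procedure, not the authors' script; the pair-based representation and the function name merge_and_dedup are hypothetical.

    def merge_and_dedup(corpora):
        """Merge lists of (source, target) sentence pairs from several
        corpora, keeping the first occurrence of each exact pair."""
        seen = set()
        merged = []
        for corpus in corpora:
            for src, tgt in corpus:
                key = (src.strip(), tgt.strip())
                if key not in seen:
                    seen.add(key)
                    merged.append((src, tgt))
        return merged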


Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grant 61772146. The authors would like to thank Biao Zhang from the University of Edinburgh for his assistance.

Author information


Corresponding author

Correspondence to Tianyong Hao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Peng, R., Hao, T. & Fang, Y. Syntax-aware neural machine translation directed by syntactic dependency degree. Neural Comput & Applic 33, 16609–16625 (2021). https://doi.org/10.1007/s00521-021-06256-4

