Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders

Neural Processing Letters

Abstract

In this paper, we propose Semantically Smooth Bilingual Recursive Autoencoders for learning bilingual phrase embeddings. The intuition behind our work is to exploit the intrinsic geometric structure of the embedding space and to constrain the learned phrase embeddings to be semantically smooth. Specifically, we extend conventional bilingual recursive autoencoders with regularization terms that preserve the translation and paraphrase probability distributions, so that the model simultaneously exploits richer explicit and implicit similarity constraints on bilingual phrase embeddings. To examine the effectiveness of our model, we incorporate two phrase-level similarity features derived from it into a state-of-the-art phrase-based statistical machine translation system. Experiments on NIST Chinese–English test sets show that our model achieves substantial improvements over the baseline.
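
The abstract does not give the regularization terms or the similarity features in closed form, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a model distribution over candidate target phrases defined by a softmax over dot products of phrase embeddings, penalizes its KL divergence from an empirical translation distribution, and uses cosine similarity as a stand-in phrase-level similarity feature. All function names and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def model_distribution(src_vec, tgt_vecs):
    """Assumed model p(t | s): softmax over embedding dot products."""
    return softmax([dot(src_vec, t) for t in tgt_vecs])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


def smoothness_penalty(src_vec, tgt_vecs, empirical_probs):
    """Assumed regularizer: divergence between the empirical translation
    distribution (e.g. from a phrase table) and the embedding-based one."""
    return kl_divergence(empirical_probs, model_distribution(src_vec, tgt_vecs))


def cosine(u, v):
    """Cosine similarity, used here as a stand-in similarity feature."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


if __name__ == "__main__":
    source = [1.0, 0.0]                      # toy source phrase embedding
    targets = [[1.0, 0.0], [0.0, 1.0]]       # toy target phrase embeddings
    empirical = [0.9, 0.1]                   # toy translation probabilities
    print(smoothness_penalty(source, targets, empirical))
    print(cosine(source, targets[0]))
```

Under these assumptions the penalty is zero exactly when the embedding-based distribution matches the empirical one, and it grows as the two diverge. A paraphrase distribution could be handled the same way with monolingual candidates in place of target phrases.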

Notes

  1. Note that the source and target languages each have their own three sets of parameters.

  2. http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html.

  3. LDC2002E18, LDC2003E07, LDC2003E14, Hansards portion of LDC2004T07, LDC2004T08 and LDC2005T06.

  4. https://nlp.stanford.edu/software/segmenter.html.

  5. http://www.statmt.org/moses/.

  6. http://www.speech.sri.com/projects/srilm/download.html.

  7. https://github.com/moses-smt/mosesdecoder/tree/master/scripts/analysis.

Acknowledgements

This work was supported by the Natural Science Foundation of China (No. 61672440), the National Key R&D Program of China (No. 2019QY1803), the Fundamental Research Funds for the Central Universities (Grant No. ZK1024), and the Scientific Research Project of the National Language Committee of China (Grant No. YB135-49).

Author information

Corresponding author

Correspondence to Jinsong Su.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lin, Q., Yang, J., Zhang, X. et al. Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders. Neural Process Lett 51, 2497–2512 (2020). https://doi.org/10.1007/s11063-020-10210-1

  • DOI: https://doi.org/10.1007/s11063-020-10210-1
