Abstract
In this paper, we propose Semantically Smooth Bilingual Recursive Autoencoders to learn bilingual phrase embeddings. The intuition behind our work is to exploit the intrinsic geometric structure of the embedding space and to constrain the learned phrase embeddings to be semantically smooth. Specifically, we extend conventional bilingual recursive autoencoders with regularization terms that preserve the translation and paraphrase probability distributions, thereby simultaneously exploiting richer explicit and implicit similarity constraints on bilingual phrase embeddings. To examine the effectiveness of our model, we incorporate two phrase-level similarity features derived from it into a state-of-the-art phrase-based statistical machine translation system. Experiments on NIST Chinese–English test sets show that our model achieves substantial improvements over the baseline.
Notes
Note that the source and target languages have three different sets of parameters.
LDC2002E18, LDC2003E07, LDC2003E14, Hansards portion of LDC2004T07, LDC2004T08 and LDC2005T06.
Acknowledgements
This work was supported by the Natural Science Foundation of China (No. 61672440), the National Key R&D Program of China (No. 2019QY1803), the Fundamental Research Funds for the Central Universities (Grant No. ZK1024), and the Scientific Research Project of National Language Committee of China (Grant No. YB135-49).
Cite this article
Lin, Q., Yang, J., Zhang, X. et al. Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders. Neural Process Lett 51, 2497–2512 (2020). https://doi.org/10.1007/s11063-020-10210-1
DOI: https://doi.org/10.1007/s11063-020-10210-1