Abstract
One challenge in Natural Language Processing (NLP) is learning semantic representations in different contexts. Recent work on pre-trained language models has received great attention and has proven to be an effective technique. Despite the success of pre-trained language models on many NLP tasks, the learned text representation only captures the correlations among the words within a sentence itself and ignores the implicit relationships between arbitrary tokens in the sequence. To address this problem, we focus on making our model effectively learn word representations that contain the relational information between any tokens of a text sequence. In this paper, we propose to integrate a relational network (RN) into a Wasserstein autoencoder (WAE). Specifically, the WAE and the RN are used to better preserve the semantic structure and to capture the relational information, respectively. Extensive experiments demonstrate that our proposed model achieves significant improvements over traditional Seq2Seq baselines.
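To make the two components concrete, the following is a minimal illustrative sketch, not the authors' released code: it pairs a relation-network layer over encoder token states (in the spirit of Santoro et al., 2017) with a sentence-level latent regularized by an MMD penalty, one common choice for WAEs (Tolstikhin et al., 2018). All module names, the GRU encoder/decoder, mean pooling, and hyperparameters are assumptions made only for illustration.

```python
# Illustrative sketch only -- the exact architecture and loss weighting in the
# paper may differ; everything below is an assumption for demonstration.
import torch
import torch.nn as nn

class RelationModule(nn.Module):
    """Relation-network layer: each ordered pair (o_i, o_j) of token states is
    scored by a shared MLP g; messages are summed over j for each position i
    and mixed by a second MLP f."""
    def __init__(self, hidden: int):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.f = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        b, t, d = h.shape                           # h: (batch, seq_len, hidden)
        left = h.unsqueeze(2).expand(b, t, t, d)    # o_i broadcast over j
        right = h.unsqueeze(1).expand(b, t, t, d)   # o_j broadcast over i
        rel = self.g(torch.cat([left, right], -1)).sum(dim=2)
        return self.f(rel)                          # (batch, seq_len, hidden)

class WAERNSketch(nn.Module):
    """GRU encoder -> relation layer -> sentence latent z -> GRU decoder."""
    def __init__(self, vocab: int, hidden: int = 256, latent: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.enc = nn.GRU(hidden, hidden, batch_first=True)
        self.rel = RelationModule(hidden)
        self.to_z = nn.Linear(hidden, latent)
        self.z_to_h = nn.Linear(latent, hidden)
        self.dec = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor):
        h, _ = self.enc(self.emb(src))              # contextual token states
        h = h + self.rel(h)                         # inject relational information
        z = self.to_z(h.mean(dim=1))                # sentence-level latent code
        dec_h, _ = self.dec(self.emb(tgt), self.z_to_h(z).unsqueeze(0))
        return self.out(dec_h), z

def mmd_penalty(z: torch.Tensor, prior: torch.Tensor, sigma: float = 1.0):
    """Biased RBF-kernel MMD between latent samples and prior samples,
    a standard WAE regularizer."""
    def k(a, b):
        d = (a.unsqueeze(1) - b.unsqueeze(0)).pow(2).sum(-1)
        return torch.exp(-d / (2 * sigma ** 2))
    return k(z, z).mean() + k(prior, prior).mean() - 2 * k(z, prior).mean()
```

In this reading, the relation layer enriches every token state with pairwise information before the WAE latent is formed, so the latent space preserves both sentence-level semantics and token-to-token relations; the training loss would combine the reconstruction cross-entropy with the MMD term above.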
Acknowledgements
This work was supported by the Science and Technology Planning Project of Henan Province of China (Grant Nos. 182102210513 and 182102310945) and the National Natural Science Foundation of China (Grant Nos. 61672361 and 61772020).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Liu, X., Yang, G., Li, F., Liu, W. (2020). WAE_RN: Integrating Wasserstein Autoencoder and Relational Network for Text Sequence. In: Sun, M., Li, S., Zhang, Y., Liu, Y., He, S., Rao, G. (eds) Chinese Computational Linguistics. CCL 2020. Lecture Notes in Computer Science, vol 12522. Springer, Cham. https://doi.org/10.1007/978-3-030-63031-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63030-0
Online ISBN: 978-3-030-63031-7
eBook Packages: Computer Science, Computer Science (R0)