Abstract
Semantic sentence matching is the task of predicting the relationship between a pair of natural language sentences. Recently, many interaction-based methods have been proposed, usually comprising encoder, matching, and aggregation components. Although some of them obtain impressive results, a simple encoder trained from scratch cannot effectively extract the global features of sentences, and transmitting information through the stacked network causes a certain loss. In this paper, we propose a Densely-connected Inference-Attention network (DCIA) that maximizes the use of the features from each layer of the network through a dense connection mechanism, and that obtains a robust encoder via contrastive self-supervised learning (SSL), which maximizes the mutual information between the global and local features of the input data. We conduct experiments on the Quora, MRPC, and SICK datasets; the results show that our method achieves competitive accuracies of 89.13%, 78.1%, and 87.7%, respectively. In addition, DCIA with SSL surpasses DCIA without SSL by about 2% in accuracy.
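The abstract combines two mechanisms that a short sketch can make concrete: dense connections, where each layer consumes the concatenation of all earlier features, and a contrastive SSL objective tying a sentence's global feature to its token-level local features. The PyTorch sketch below is a minimal illustration of those two ideas, not the authors' implementation; the layer dimensions, pooling choices, and the InfoNCE-style loss with temperature `tau` are all illustrative assumptions rather than details from the paper.

```python
# Minimal sketch (assumed details, not the paper's released code) of
# dense connections plus a local-global contrastive SSL objective.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenselyConnectedEncoder(nn.Module):
    """Each block sees the concatenation of the input and all earlier outputs."""

    def __init__(self, input_dim: int, hidden_dim: int, num_blocks: int):
        super().__init__()
        self.blocks = nn.ModuleList()
        dim = input_dim
        for _ in range(num_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden_dim), nn.ReLU()))
            dim += hidden_dim  # dense connection: feature width grows by concatenation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]  # x: (batch, seq_len, input_dim)
        for block in self.blocks:
            feats.append(block(torch.cat(feats, dim=-1)))
        return torch.cat(feats, dim=-1)  # every layer's features reach the output


def local_global_infonce(local: torch.Tensor, glob: torch.Tensor, tau: float = 0.1):
    """InfoNCE-style loss: each sentence's global vector should score its own
    local (token-level) features higher than those of other sentences in the
    batch, which lower-bounds the local-global mutual information.
    local: (batch, seq_len, dim); glob: (batch, dim).
    """
    local = F.normalize(local.mean(dim=1), dim=-1)  # summarize locals per sentence
    glob = F.normalize(glob, dim=-1)
    logits = glob @ local.t() / tau                 # (batch, batch) similarities
    targets = torch.arange(glob.size(0))            # positives on the diagonal
    return F.cross_entropy(logits, targets)


# Toy usage: pretrain the encoder with the SSL objective before matching.
enc = DenselyConnectedEncoder(input_dim=300, hidden_dim=128, num_blocks=3)
tokens = torch.randn(8, 20, 300)                    # e.g. GloVe word embeddings
local_feats = enc(tokens)
global_feats = local_feats.max(dim=1).values        # simple max-pooled global feature
loss = local_global_infonce(local_feats, global_feats)
loss.backward()
```

In this sketch, the encoder pretrained with the contrastive loss would then be reused inside a matching network; the pooling and loss details above stand in for whatever the paper's DCIA actually uses.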
Acknowledgements
This paper is supported by the National Key Research and Development Program of China under grants No. 2018YFB0204403, No. 2017YFB1401202, and No. 2018YFB1003500. The corresponding author is Jianzong Wang from Ping An Technology (Shenzhen) Co., Ltd.