Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12858)

Abstract

Semantic sentence matching is the task of predicting the relationship between a pair of natural language sentences. Recently, many interaction-based methods have been proposed, usually consisting of encoder, matching, and aggregation components. Although some of them achieve impressive results, a simple encoder trained from scratch cannot effectively extract the global features of sentences, and transmitting information through the stacked network causes a certain amount of loss. In this paper, we propose a Densely-connected Inference-Attention network (DCIA) that maximizes the use of the features from each layer of the network via a dense connection mechanism, and that obtains a robust encoder via contrastive self-supervised learning (SSL), which maximizes the mutual information between the global and local features of the input data. We conducted experiments on the Quora, MRPC, and SICK datasets; the results show that our method is competitive, achieving accuracies of 89.13%, 78.1%, and 87.7%, respectively. In addition, DCIA with SSL outperforms DCIA without SSL by about 2% in accuracy.
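The abstract names two mechanisms: dense connections that feed every layer the outputs of all preceding layers, and a contrastive SSL objective that maximizes mutual information between a sentence's global feature and its local (token-level) features. The PyTorch sketch below illustrates both ideas; it is not the authors' implementation. `DenseEncoderBlock` and `info_nce_global_local` are hypothetical names, the global feature is assumed to be mean-pooled, and InfoNCE is used as one common contrastive estimator of that mutual information.

```python
# Hedged sketch of the two ideas in the abstract (hypothetical names):
# dense connections across encoder layers, and an InfoNCE-style loss that
# ties a sentence's global feature to its own local token features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseEncoderBlock(nn.Module):
    """One densely-connected layer: it receives the concatenation of the
    raw embeddings and all preceding layers' outputs, so earlier features
    are never lost as depth grows."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, features):            # features: list of [B, T, D_i]
        return torch.relu(self.proj(torch.cat(features, dim=-1)))

def info_nce_global_local(local_feats, temperature=0.1):
    """InfoNCE-style loss: each sentence's mean-pooled global feature must
    score higher against its own token features than against those of the
    other sentences in the batch (in-batch negatives)."""
    B = local_feats.size(0)
    g = F.normalize(local_feats.mean(dim=1), dim=-1)   # [B, D] global
    l = F.normalize(local_feats, dim=-1)               # [B, T, D] local
    # Similarity of every global vector to every token of every sentence,
    # pooled over tokens: positives lie on the diagonal of a [B, B] grid.
    scores = torch.einsum('bd,ctd->bct', g, l).mean(dim=2) / temperature
    return F.cross_entropy(scores, torch.arange(B, device=local_feats.device))

# Usage: a two-block densely-connected stack, then the SSL objective.
tokens = torch.randn(4, 12, 300)             # e.g. a GloVe-embedded batch
block1 = DenseEncoderBlock(300, 128)
block2 = DenseEncoderBlock(300 + 128, 128)   # also sees the raw embeddings
h1 = block1([tokens])
h2 = block2([tokens, h1])
loss = info_nce_global_local(h2)             # pretraining signal for encoder
```

Because each block concatenates all earlier features, later layers can recover information that a plain stack would have to squeeze through a single bottleneck, which is the transmission loss the abstract alludes to.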

Acknowledgements

This work was supported by the National Key Research and Development Program of China under grants No. 2018YFB0204403, No. 2017YFB1401202, and No. 2018YFB1003500. The corresponding author is Jianzong Wang from Ping An Technology (Shenzhen) Co., Ltd.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, F., Wang, J., Tao, D., Cheng, N., Xiao, J. (2021). Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network. In: U, L.H., Spaniol, M., Sakurai, Y., Chen, J. (eds) Web and Big Data. APWeb-WAIM 2021. Lecture Notes in Computer Science, vol 12858. Springer, Cham. https://doi.org/10.1007/978-3-030-85896-4_21

  • DOI: https://doi.org/10.1007/978-3-030-85896-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85895-7

  • Online ISBN: 978-3-030-85896-4

  • eBook Packages: Computer Science (R0)
