
Simple Flow-Based Contrastive Learning for BERT Sentence Representations

  • Conference paper
Advances in Swarm Intelligence (ICSI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13345)


Abstract

Natural language processing is a significant branch of machine learning, and pre-trained models such as BERT are widely used in it. Previous research has shown that sentence embeddings from pre-trained language models without fine-tuning have difficulty capturing exact sentence semantics. The ambiguous semantics leads to poor performance on semantic textual similarity (STS) tasks. However, fine-tuning tends to skew the model toward high-frequency distributions because word-frequency and word-sense distributions are heterogeneous, so fine-tuning is not an optimal choice either. To address this issue, we propose an unsupervised flow-based contrastive learning model. The model maps sentence embedding distributions to a smooth and isotropic Gaussian distribution, mitigating the impact of irregular word-frequency distributions. We evaluate the model with an industry-recognized evaluation protocol, and it outperforms competing baselines on a range of sentence-related tasks.
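The abstract describes mapping BERT sentence embeddings through a flow toward a smooth, isotropic Gaussian space while training with an unsupervised contrastive objective. Below is a minimal sketch of that idea, assuming a RealNVP-style coupling-flow head over mean-pooled BERT embeddings and an NT-Xent (InfoNCE) loss over two dropout-induced views of each sentence; the module names, layer sizes, pooling choice, and temperature are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch (assumptions noted above): a coupling-flow head over
# mean-pooled BERT embeddings trained with an NT-Xent contrastive loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer


class AffineCoupling(nn.Module):
    """One invertible RealNVP-style affine coupling layer."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, (dim - self.half) * 2),
        )

    def forward(self, x):
        a, b = x[:, :self.half], x[:, self.half:]
        scale, shift = self.net(a).chunk(2, dim=-1)
        b = b * torch.exp(torch.tanh(scale)) + shift  # invertible affine transform of b
        return torch.cat([a, b], dim=-1)


class FlowHead(nn.Module):
    """Stack of coupling layers mapping embeddings toward a smoother, more isotropic space."""
    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])

    def forward(self, x):
        for layer in self.layers:
            x = torch.flip(layer(x), dims=[1])  # reverse features so both halves get updated
        return x


def nt_xent(z1, z2, temperature=0.05):
    """InfoNCE / NT-Xent loss: matching rows of z1 and z2 are positives, the rest negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature   # cosine similarity matrix
    labels = torch.arange(z1.size(0))    # positives sit on the diagonal
    return F.cross_entropy(logits, labels)


# Two forward passes with dropout enabled yield two views of the same sentences.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()                          # keep dropout active so the two views differ
flow = FlowHead(encoder.config.hidden_size)

batch = tokenizer(["a sentence", "another sentence"], padding=True, return_tensors="pt")
emb1 = encoder(**batch).last_hidden_state.mean(dim=1)   # mean pooling, view 1
emb2 = encoder(**batch).last_hidden_state.mean(dim=1)   # second pass gives view 2
loss = nt_xent(flow(emb1), flow(emb2))
loss.backward()
```

At inference time one would encode each sentence once, pass it through the trained flow, and score sentence pairs by cosine similarity for STS-style evaluation.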

This work is supported by the Key Cooperation Project of the Chongqing Municipal Education Commission (HZ2021008), and partly funded by the State Key Program of the National Natural Science Foundation of China (61936001), the National Natural Science Foundation of China (61772096), the Key Research and Development Program of Chongqing (cstc2017zdcy-zdyfx0091), and the Key Research and Development Program on AI of Chongqing (cstc2017rgzn-zdyfx0022).



Author information

Corresponding author: Qun Liu


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Tian, Z., Liu, Q., Liu, M., Deng, W. (2022). Simple Flow-Based Contrastive Learning for BERT Sentence Representations. In: Tan, Y., Shi, Y., Niu, B. (eds) Advances in Swarm Intelligence. ICSI 2022. Lecture Notes in Computer Science, vol 13345. Springer, Cham. https://doi.org/10.1007/978-3-031-09726-3_24

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-09726-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09725-6

  • Online ISBN: 978-3-031-09726-3

  • eBook Packages: Computer Science, Computer Science (R0)
