Abstract
Natural language processing is a major branch of machine learning, in which pre-trained models such as BERT are widely used. Prior research has shown that sentence embeddings taken from pre-trained language models without fine-tuning capture sentence semantics poorly, and this ambiguity leads to weak performance on semantic textual similarity (STS) tasks. However, fine-tuning tends to skew the model toward high-frequency words, because word frequency and word sense are heterogeneously distributed, so fine-tuning is not an optimal choice either. To address this issue, we propose an unsupervised flow-based contrastive learning model. The model maps the sentence embedding distribution to a smooth, isotropic Gaussian distribution, thereby mitigating the impact of the irregular word frequency distribution. To evaluate the model, we use an industry-recognized evaluation protocol, on which it outperforms competing baselines across a range of sentence-related tasks.
This work is supported by the Key Cooperation Project of the Chongqing Municipal Education Commission (HZ2021008), and partly funded by the State Key Program of the National Natural Science Foundation of China (61936001), the National Natural Science Foundation of China (61772096), the Key Research and Development Program of Chongqing (cstc2017zdcy-zdyfx0091), and the Key Research and Development Program on AI of Chongqing (cstc2017rgzn-zdyfx0022).
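To make the idea concrete, the following is a minimal sketch of flow-based contrastive learning, written under our own assumptions rather than as the paper's exact architecture: a small RealNVP-style normalizing flow (affine coupling layers) is trained on frozen BERT sentence embeddings with two losses, a negative log-likelihood pushing the flowed embeddings toward an isotropic standard Gaussian, and an InfoNCE contrastive loss between two dropout-noised views of each sentence. The names AffineCoupling, flow_nll, and info_nce, the use of dropout to build positive pairs, and all hyperparameters are illustrative assumptions.

# Minimal, assumption-laden sketch of flow-based contrastive learning.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer: rescales and shifts one half of the
    embedding dimensions conditioned on the other half. The Jacobian
    log-determinant is the sum of the predicted log-scales."""
    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        assert dim % 2 == 0
        self.flip = flip
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),  # predicts (log-scale, shift)
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        if self.flip:                # alternate which half is transformed
            x1, x2 = x2, x1
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)    # bound the scales for stability
        z2 = x2 * torch.exp(log_s) + t
        parts = [z2, x1] if self.flip else [x1, z2]
        return torch.cat(parts, dim=-1), log_s.sum(-1)

def flow_nll(z, logdet):
    """Negative log-likelihood of the flowed embeddings under an isotropic
    standard Gaussian prior (the 'smoothing' objective)."""
    d = z.size(-1)
    log_pz = -0.5 * (z ** 2).sum(-1) - 0.5 * d * math.log(2 * math.pi)
    return -(log_pz + logdet).mean()

def info_nce(z1, z2, tau=0.05):
    """Contrastive loss: matching views are positives, all other
    in-batch pairs serve as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / tau                      # (B, B) cosine similarities
    return F.cross_entropy(sim, torch.arange(z1.size(0)))

dim = 768                                        # BERT-base embedding size
flows = nn.ModuleList(AffineCoupling(dim, flip=i % 2 == 1) for i in range(4))

def run_flow(x):
    logdet = x.new_zeros(x.size(0))
    for f in flows:
        x, ld = f(x)
        logdet = logdet + ld
    return x, logdet

# Stand-in for frozen-BERT sentence embeddings; two dropout-noised views
# of each sentence form the positive pair.
emb = torch.randn(32, dim)
z1, ld1 = run_flow(F.dropout(emb, 0.1))
z2, ld2 = run_flow(F.dropout(emb, 0.1))
loss = info_nce(z1, z2) + 0.5 * (flow_nll(z1, ld1) + flow_nll(z2, ld2))
loss.backward()                                  # updates only the flow

Only the lightweight flow is trained here, which reflects the unsupervised setting described in the abstract: the pre-trained encoder stays fixed, and the flow alone reshapes the embedding distribution.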