An Unsupervised Sentence Embedding Method by Maximizing the Mutual Information of Augmented Text Representations

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13530)

Abstract

For natural language processing tasks with unlabeled or partially labeled datasets, it is vital to learn sentence representations in an unsupervised manner. However, unsupervised methods pale in comparison to supervised ones on many tasks. Recently, some unsupervised methods have proposed to learn sentence representations by maximizing the mutual information (MI) between text representations at different levels, such as global MI maximization (between two global representations) and local MI maximization (between local and global representations). Among these methods, local MI maximization encourages the global representation to capture information that is shared across the local contexts. Despite this advantage, the approach suffers from the inherent gap between the semantic information contained in the global representations and that contained in the local representations. Consequently, its performance is inferior both to models using global MI maximization and to supervised models. In this paper, we propose an unsupervised sentence embedding method that maximizes the mutual information of augmented text representations. Experimental results show that our model achieves an average Spearman's correlation of 73.36% on a series of semantic textual similarity tasks, a 7-point improvement over the previous best model using local MI maximization. Furthermore, our model outperforms models using global MI maximization and closes the gap to supervised methods to 1.5 points.
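
The abstract does not specify the paper's MI estimator, encoder, or augmentation strategy, so the sketch below only illustrates the general idea it describes: maximizing a contrastive (InfoNCE-style) lower bound on the mutual information between two augmented views of each sentence. All names and values in it (info_nce_loss, the temperature, the dropout-style augmentation) are assumptions made for illustration, not the authors' implementation.

import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.07):
    # Hypothetical sketch, not the paper's method: z1 and z2 are two augmented
    # views (batch, dim) of the same batch of sentence embeddings.
    z1 = F.normalize(z1, dim=-1)          # cosine-normalize both views
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature    # (batch, batch) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)  # positives sit on the diagonal
    # Minimizing this cross-entropy maximizes an InfoNCE lower bound on the
    # mutual information between the two augmented views.
    return F.cross_entropy(logits, labels)

# Placeholder usage: random tensors stand in for two encoder passes over the
# same sentences under different augmentations.
batch, dim = 32, 768
view1 = torch.randn(batch, dim, requires_grad=True)
view2 = torch.randn(batch, dim, requires_grad=True)
loss = info_nce_loss(view1, view2)
loss.backward()

In a real training loop the two views would come from an encoder (for example, a BERT-style model run twice with different dropout masks or other augmentations), and the gradient would update the encoder's parameters; the temperature of 0.07 is a common contrastive-learning default, not a value reported by the paper.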

Author information

Corresponding author

Correspondence to Tianye Sheng.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sheng, T., Wang, L., He, Z., Sun, M., Jiang, G. (2022). An Unsupervised Sentence Embedding Method by Maximizing the Mutual Information of Augmented Text Representations. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13530. Springer, Cham. https://doi.org/10.1007/978-3-031-15931-2_15

  • DOI: https://doi.org/10.1007/978-3-031-15931-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15930-5

  • Online ISBN: 978-3-031-15931-2

  • eBook Packages: Computer Science, Computer Science (R0)
