Abstract
Contrastive learning models have been successfully applied to representation learning for a variety of downstream tasks. The positive samples used in contrastive learning are typically derived from augmented data, a strategy that has improved performance on many computer vision tasks but remains underexploited in natural language processing tasks such as text classification; existing data augmentation methods have rarely been combined with contrastive learning in NLP. In this paper, we propose a Text Augmentation Contrastive Learning Representation model, TACLR, which combines easy text augmentation techniques (i.e., synonym replacement, random insertion, random swap, and random deletion) and the textMixup augmentation method with contrastive learning for text classification. Furthermore, we propose a unified method that flexibly adapts to supervised, semi-supervised, and unsupervised learning. Experimental results on five text classification datasets show that TACLR significantly improves classification accuracy. We also provide extensive ablation studies exploring the contribution of each component of our model. The source code of our work is publicly available at https://gitlab.com/models-for-paper/taclr.
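To make the setup concrete, the sketch below illustrates how augmented views of a sentence can form positive pairs for a SimCLR-style NT-Xent contrastive loss. This is a minimal illustration under assumptions, not the authors' TACLR implementation: the encoder, tokenization, deletion/swap rates, and temperature are placeholders, and TACLR's textMixup component and supervised/semi-supervised variants are omitted.

```python
# Minimal sketch (not the authors' code): two EDA-style operations produce
# augmented views of each sentence, and an NT-Xent loss pulls the two views
# of the same sentence together while pushing apart all other pairs.
import random
import torch
import torch.nn.functional as F

def random_deletion(tokens, p=0.1):
    """Drop each token with probability p (one of the four EDA operations)."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]  # never return an empty text

def random_swap(tokens, n_swaps=1):
    """Swap two randomly chosen token positions n_swaps times."""
    if len(tokens) < 2:
        return tokens
    tokens = tokens[:]
    for _ in range(n_swaps):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of positive pairs (z1[i], z2[i])."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d) unit-norm embeddings
    sim = z @ z.t() / temperature                 # pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))             # exclude self-similarity
    n = z1.size(0)
    # the positive for row i is row i+n, and vice versa
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Hypothetical usage: encoder(...) stands in for any sentence encoder
# (e.g., BERT followed by pooling and a projection head).
# view_a = [random_deletion(s.split()) for s in batch_sentences]
# view_b = [random_swap(s.split()) for s in batch_sentences]
# loss = nt_xent(encoder(view_a), encoder(view_b))
```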
About this article
Cite this article
Jia, O., Huang, H., Ren, J. et al. Contrastive learning with text augmentation for text classification. Appl Intell 53, 19522–19531 (2023). https://doi.org/10.1007/s10489-023-04453-3