
Contrastive learning with text augmentation for text classification


Abstract

Various contrastive learning models have been successfully applied to representation learning for downstream tasks. The positive samples used in contrastive learning are typically derived from augmented data; such augmentation has improved performance on many computer vision tasks but remains underexploited in natural language processing tasks such as text classification, where existing data augmentation methods have rarely been combined with contrastive learning. In this paper, we propose a Text Augmentation Contrastive Learning Representation model, TACLR, which combines easy text augmentation techniques (i.e., synonym replacement, random insertion, random swap, and random deletion) and the textMixup augmentation method with contrastive learning for the text classification task. Furthermore, we propose a unified method that flexibly adapts to supervised, semi-supervised, and unsupervised learning. Experimental results on five text classification datasets show that TACLR significantly improves text classification accuracy. We also provide extensive ablation studies exploring the contribution of each component of our model. The source code of our work is publicly available at https://gitlab.com/models-for-paper/taclr.
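To make the abstract's ingredients concrete, the following is a minimal sketch, not the authors' implementation: the four easy text augmentations, an embedding-level textMixup, and an NT-Xent-style contrastive loss of the kind used in SimCLR-like models. The SYNONYMS table, all function names, and all hyperparameter values are illustrative assumptions; the actual TACLR code lives in the linked repository.

```python
# Sketch of the components named in the abstract (illustrative only).
# SYNONYMS is a hypothetical stand-in for a real thesaurus such as WordNet.
import random
import torch
import torch.nn.functional as F

SYNONYMS = {"good": ["great", "fine"], "movie": ["film", "picture"]}

def synonym_replacement(words, n=1):
    out = words[:]
    idx = [i for i, w in enumerate(out) if w in SYNONYMS]
    for i in random.sample(idx, min(n, len(idx))):
        out[i] = random.choice(SYNONYMS[out[i]])
    return out

def random_insertion(words, n=1):
    out = words[:]
    for _ in range(n):
        donors = [w for w in out if w in SYNONYMS]
        if not donors:
            break
        syn = random.choice(SYNONYMS[random.choice(donors)])
        out.insert(random.randrange(len(out) + 1), syn)
    return out

def random_swap(words, n=1):
    out = words[:]
    for _ in range(n):
        if len(out) < 2:
            break
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(words, p=0.1):
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]  # never return empty

def text_mixup(emb_a, emb_b, alpha=0.2):
    # Mixup applied to sentence embeddings rather than raw token sequences.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * emb_a + (1.0 - lam) * emb_b, lam

def nt_xent_loss(z1, z2, temperature=0.5):
    # z1[i] and z2[i] are encoder outputs for two augmented views of sentence i.
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2n, d)
    sim = z @ z.t() / temperature                       # scaled cosine sims
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))               # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n),        # the positive of view i
                         torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

In a supervised or semi-supervised setting, one would presumably combine this contrastive objective with a classification loss on labeled examples, consistent with the unified method the abstract describes; the sketch assumes an encoder (e.g., a BERT-style model) that maps each sentence to a fixed-size embedding.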




Author information

Correspondence to Huimin Huang or Jiaxin Ren.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Jia, O., Huang, H., Ren, J. et al. Contrastive learning with text augmentation for text classification. Appl Intell 53, 19522–19531 (2023). https://doi.org/10.1007/s10489-023-04453-3

