
Supervised contrastive learning for robust text adversarial training

  • Original Article, published in Neural Computing and Applications

Abstract

The lack of robustness is a serious problem for deep neural networks (DNNs): it leaves them vulnerable to adversarial examples. A promising way to alleviate this problem is adversarial training, which lets the model learn features from adversarial examples. However, adversarial training usually produces overfitted models and may fail when facing a new attack. We believe this is because previous adversarial training based on cross-entropy loss ignores the similarity between adversarial examples and their original examples, which results in a low margin. Accordingly, we propose a supervised adversarial contrastive learning (SACL) approach for adversarial training. SACL uses a supervised adversarial contrastive loss that contains both a cross-entropy term and an adversarial contrastive term. The cross-entropy term guides the inductive bias learning of the DNN, while the adversarial contrastive term helps the model learn example representations by maximizing the feature consistency between adversarial examples and their original examples, which fits well with the goal of addressing low margins. In addition, SACL trains only on adversarial examples that successfully fool the model, together with their corresponding original examples. This provides the model with more accurate information about the decision boundary and yields a model that better fits the example distribution. Experiments show that SACL reduces the attack success rate of multiple adversarial attack algorithms against different models on text classification tasks. Its defensive performance is significantly better than that of other adversarial training approaches, without reducing the generalization ability of the model. Moreover, DNN models trained with our approach exhibit high transferability and robustness.
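To make the described objective concrete, the following is a minimal PyTorch sketch of how a cross-entropy term can be combined with a supervised contrastive term computed over pairs of original and adversarial examples, in the spirit of the loss described above. It is an illustration, not the authors' implementation: the function name, the temperature `tau`, the weighting factor `lam`, and the assumed batch layout (originals followed by their adversarial counterparts) are hypothetical choices made here for clarity.

```python
# Minimal sketch (an assumption for illustration, not the paper's code) of a
# supervised adversarial contrastive objective: a cross-entropy term plus a
# SupCon-style contrastive term in which an original example and its
# adversarial counterpart share a label and are therefore pulled together.
import torch
import torch.nn.functional as F


def supervised_adversarial_contrastive_loss(features, logits, labels,
                                            tau=0.1, lam=0.5):
    """features: (2N, d) L2-normalised representations of N original examples
    followed by their N adversarial counterparts; logits: (2N, C) classifier
    outputs; labels: (2N,) gold labels (each adversarial example keeps the
    label of the example it was crafted from)."""
    # Cross-entropy term: guides ordinary inductive-bias learning.
    ce = F.cross_entropy(logits, labels)

    # Supervised contrastive term: same-label examples are positives, so each
    # original example is pulled toward its adversarial counterpart.
    n = features.size(0)
    sim = features @ features.t() / tau                      # (2N, 2N) similarities
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float("-inf"))          # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    # Mean log-probability over each anchor's positives (zero where no positive).
    per_anchor = torch.where(pos_mask, log_prob,
                             torch.zeros_like(log_prob)).sum(dim=1)
    contrastive = (-per_anchor / pos_counts).mean()

    return ce + lam * contrastive
```

As the abstract notes, SACL trains only on adversarial examples that actually fool the model, together with their originals; in a sketch like this, that filtering would happen when the mini-batch is assembled, before the loss is computed.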


Data availability

The datasets generated and/or analysed during the current study are available in the GitHub repository: https://github.com/chrisli1995/paper/tree/main/SACL.

Acknowledgements

This work is supported by the Joint Funds of the National Natural Science Foundation of China (U1936122) and the Primary Research & Development Plan of Hubei Province (2020BAB101).

Author information

Corresponding author

Correspondence to Bo Zhao.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Financial interest

The authors have no relevant financial or non-financial interests to disclose. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, W., Zhao, B., An, Y. et al. Supervised contrastive learning for robust text adversarial training. Neural Comput & Applic 35, 7357–7368 (2023). https://doi.org/10.1007/s00521-022-07871-5
