Abstract
The lack of robustness is a serious problem for deep neural networks (DNNs), leaving them vulnerable to adversarial examples. A promising remedy is adversarial training, which lets the model learn features from adversarial examples. However, adversarial training often produces overfitted models and may fail when facing a new attack. We believe this is because previous adversarial training with cross-entropy loss ignores the similarity between adversarial examples and their original counterparts, which results in a low margin. Accordingly, we propose a supervised adversarial contrastive learning (SACL) approach for adversarial training. SACL uses a supervised adversarial contrastive loss that combines a cross-entropy term with an adversarial contrastive term. The cross-entropy term guides the DNN's inductive bias learning, while the adversarial contrastive term helps the model learn example representations by maximizing feature consistency between adversarial examples and their originals, which fits well with the goal of resolving low margins. In addition, SACL trains only on adversarial examples that successfully fool the model, together with their corresponding original examples. This provides the model with more accurate information about the decision boundary and yields a model that better fits the example distribution. Experiments show that SACL reduces the attack success rate of multiple adversarial attack algorithms against different models on text classification tasks. Its defensive performance is significantly better than that of other adversarial training approaches, without reducing the model's generalization ability. Moreover, DNN models trained with our approach exhibit high transferability and robustness.
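The abstract does not give the exact formulation of the loss, so the following NumPy sketch illustrates one plausible form: a cross-entropy term plus a supervised contrastive term computed over a batch that stacks original examples with their adversarial counterparts (which share labels). The function names `sacl_loss`, `sup_contrastive`, the weight `lam`, and the temperature `tau` are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def sup_contrastive(features, labels, tau=0.1):
    """Supervised contrastive term (in the spirit of Khosla et al., 2020):
    pull together representations that share a label, push apart the rest."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / tau                         # scaled pairwise cosine similarities
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)     # exclude self-comparisons
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask  # same-label pairs
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return per_anchor.mean()

def sacl_loss(logits, features, labels, lam=1.0, tau=0.1):
    """Combined objective: cross-entropy guides classification, while the
    contrastive term ties adversarial representations to their originals.
    `logits`/`features` are assumed to stack original and successful
    adversarial examples, with `labels` repeated accordingly."""
    return cross_entropy(logits, labels) + lam * sup_contrastive(features, labels, tau)
```

Under this reading, an adversarial example and its original enter the batch with the same label, so the contrastive term directly rewards consistent features for the pair while pushing other-class representations apart, which is the margin-enlarging effect the abstract describes.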





Data availability
The datasets generated during and/or analysed during the current study are available in the GitHub repository, [https://github.com/chrisli1995/paper/tree/main/SACL].
Acknowledgements
This work is supported by the Joint Funds of the National Natural Science Foundation of China (U1936122) and the Primary Research & Development Plan of Hubei Province (2020BAB101).
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Financial interest
The authors have no relevant financial or non-financial interests to disclose, and certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, W., Zhao, B., An, Y. et al. Supervised contrastive learning for robust text adversarial training. Neural Comput & Applic 35, 7357–7368 (2023). https://doi.org/10.1007/s00521-022-07871-5