Abstract
Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples (AEs). Denoising based on input pre-processing is one defense against adversarial attacks. However, it is difficult to remove multiple kinds of adversarial perturbation, especially in the presence of evolving attacks. To address this challenge, we attempt to extract the commonality of adversarial perturbations. Because adversarial perturbations are imperceptible in the input space, we conduct the extraction in the deep feature space, where the perturbations become more apparent. Using the obtained common characteristics, we craft common adversarial examples (CAEs) to train the denoiser. Furthermore, to prevent image distortion while removing as much of the adversarial perturbation as possible, we propose a hybrid loss function that guides training at both the pixel level and in the deep feature space. Our experiments show that our defense can eliminate multiple adversarial perturbations, significantly enhancing adversarial robustness compared with previous state-of-the-art methods. Moreover, it is plug-and-play for various classification models, which demonstrates the generalizability of our defense.
Data Availability and Access
The data used to support the findings of this study are available from the corresponding author upon request.
Acknowledgements
This work was supported by the Major Research Plan of the National Natural Science Foundation of China (92167203), Key Program of Zhejiang Provincial Natural Science Foundation of China (LZ22F020007), the Major Key Project of Peng Cheng Laboratory (2022A03), and Science and Technology Innovation Foundation for Graduate Students of Zhejiang University of Science and Technology (F464108M05).
Author information
Authors and Affiliations
Contributions
Jianchang Huang: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Software. Yaguan Qian: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Funding acquisition. Yinyao Dai: Data curation, Investigation, Writing - Review & Editing. Fang Lu: Writing - Review & Editing, Investigation. Bin Wang: Writing - Review & Editing, Funding acquisition. Zhaoquan Gu: Writing - Review & Editing. Boyang Zhou: Writing - Review & Editing.
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Ethics Approval
The data used in this paper do not involve ethical issues.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, J., Dai, Y., Lu, F. et al. Adversarial perturbation denoising utilizing common characteristics in deep feature space. Appl Intell 54, 1672–1690 (2024). https://doi.org/10.1007/s10489-023-05253-5