Adversarial perturbation denoising utilizing common characteristics in deep feature space

Applied Intelligence

Abstract

Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples (AEs). Denoising based on input pre-processing is one defense against adversarial attacks. However, removing multiple kinds of adversarial perturbations is difficult, especially in the presence of evolving attacks. To address this challenge, we attempt to extract the commonality of adversarial perturbations. Because adversarial perturbations are imperceptible in the input space, we perform the extraction in the deep feature space, where the perturbations become more apparent. From the obtained common characteristics, we craft common adversarial examples (CAEs) to train the denoiser. Furthermore, to avoid image distortion while removing as much of the adversarial perturbation as possible, we propose a hybrid loss function that guides training at both the pixel level and the deep feature level. Our experiments show that our defense eliminates multiple kinds of adversarial perturbations and significantly enhances adversarial robustness compared with previous state-of-the-art methods. Moreover, it is plug-and-play for various classification models, which demonstrates its generalizability.
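To make the hybrid objective concrete, the sketch below shows one plausible form of a denoiser training loss combining a pixel-level term with a deep-feature term, in the spirit of the abstract. This is a minimal PyTorch sketch under stated assumptions: the names `denoiser`, `feature_extractor`, and `lam`, the MSE form of both terms, and the choice of feature layer are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def hybrid_loss(denoiser: nn.Module,
                feature_extractor: nn.Module,
                x_clean: torch.Tensor,
                x_adv: torch.Tensor,
                lam: float = 1.0) -> torch.Tensor:
    """Pixel-level + feature-level training objective for a denoiser.

    `feature_extractor` is assumed to be a frozen sub-network of the
    target classifier, truncated at some deep layer.
    """
    x_denoised = denoiser(x_adv)

    # Pixel-level term: keep the denoised image close to the clean one,
    # limiting image distortion.
    pixel_term = F.mse_loss(x_denoised, x_clean)

    # Feature-level term: match deep features of the denoised image to
    # those of the clean image, where perturbations are more apparent.
    with torch.no_grad():
        f_clean = feature_extractor(x_clean)
    f_denoised = feature_extractor(x_denoised)
    feature_term = F.mse_loss(f_denoised, f_clean)

    return pixel_term + lam * feature_term
```

In training, `x_adv` would be the common adversarial examples (CAEs) crafted from the extracted common characteristics, and the weight `lam` trades off distortion control at the pixel level against perturbation removal in the feature space.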


Data Availability and Access

The data used to support the findings of this study will be available from the corresponding author upon request after acceptance.


Acknowledgements

This work was supported by the Major Research Plan of the National Natural Science Foundation of China (92167203), the Key Program of the Zhejiang Provincial Natural Science Foundation of China (LZ22F020007), the Major Key Project of Peng Cheng Laboratory (2022A03), and the Science and Technology Innovation Foundation for Graduate Students of Zhejiang University of Science and Technology (F464108M05).

Author information

Contributions

Jianchang Huang: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Software. Yaguan Qian: Conceptualization, Methodology, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Funding acquisition. Yinyao Dai: Data curation, Investigation, Writing - review & editing. Fang Lu: Writing - review & editing, Investigation. Bin Wang: Writing - review & editing, Funding acquisition. Zhaoquan Gu: Writing - review & editing. Boyang Zhou: Writing - review & editing.

Corresponding author

Correspondence to Yaguan Qian.

Ethics declarations

Conflicts of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Ethics Approval

The data used in this paper do not raise any ethical issues.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, J., Dai, Y., Lu, F. et al. Adversarial perturbation denoising utilizing common characteristics in deep feature space. Appl Intell 54, 1672–1690 (2024). https://doi.org/10.1007/s10489-023-05253-5

