
Adversarial example detection by predicting adversarial noise in the frequency domain

Multimedia Tools and Applications

Abstract

Recent advances in deep neural network (DNN) techniques have increased the importance of the security and robustness of the algorithms in which DNNs are applied. However, several studies have demonstrated that neural networks are vulnerable to adversarial examples, which are generated by adding crafted adversarial noise to input images. Because this noise is typically imperceptible to the human eye, DNNs are difficult to defend. One line of defense is to detect adversarial examples by analyzing the characteristics of input images. Recent detection methods exploit the hidden-layer outputs of the target classifier to improve robustness, but they require access to that classifier. Moreover, they provide no post-processing step for detected adversarial examples, which are simply discarded. To resolve these problems, we propose a novel detection-based method that predicts the adversarial noise and detects the adversarial example from the predicted noise, without any information from the target classifier. We first generate adversarial examples and obtain the adversarial noise as the residual between each original image and its adversarial counterpart. We then train the proposed adversarial noise predictor to estimate this noise image, and train the adversarial detector on the input images together with the predicted noise. The proposed framework is agnostic to the input image modality. Furthermore, the predicted noise can be used to reconstruct detected adversarial examples as non-adversarial images instead of discarding them. We evaluated the proposed method against the fast gradient sign method (FGSM), basic iterative method (BIM), projected gradient descent (PGD), DeepFool, and Carlini & Wagner attacks on the CIFAR-10 and CIFAR-100 datasets provided by the Canadian Institute for Advanced Research (CIFAR). Our method achieved significant improvements in detection accuracy over state-of-the-art methods: the noise predictor successfully captured the adversarial noise in the Fourier domain and improved detection performance. Moreover, the reconstruction process based on the predicted noise resolves the post-processing problem, so detected adversarial examples are no longer discarded and wasted.
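The pipeline described above can be summarized in a short sketch. The following PyTorch code is a minimal illustration under our own assumptions (the network sizes, the loss weighting, and the use of the FFT magnitude are illustrative choices, not the authors' implementation): the noise target is the residual between a clean image and its adversarial example, a small predictor regresses that noise with an additional frequency-domain term, and a detector classifies the image concatenated with the predicted noise; the same predicted noise can be subtracted to reconstruct a detected adversarial example.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NoisePredictor(nn.Module):
        # Small encoder-decoder that estimates the adversarial noise of an input image.
        def __init__(self, channels=3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, channels, 3, padding=1),
            )

        def forward(self, x):
            return self.net(x)  # predicted noise, same shape as x

    class AdversarialDetector(nn.Module):
        # Binary classifier over the image concatenated with its predicted noise.
        def __init__(self, channels=3):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(2 * channels, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, 2)  # clean vs. adversarial

        def forward(self, x, noise):
            z = self.features(torch.cat([x, noise], dim=1)).flatten(1)
            return self.head(z)

    def noise_target(x_clean, x_adv):
        # The adversarial noise is the residual between the adversarial and clean image.
        return x_adv - x_clean

    def noise_loss(pred_noise, true_noise):
        # Match the residual both spatially and in its Fourier magnitude
        # (the exact frequency-domain formulation in the paper may differ).
        spatial = F.l1_loss(pred_noise, true_noise)
        spectral = F.l1_loss(torch.fft.fft2(pred_noise, norm="ortho").abs(),
                             torch.fft.fft2(true_noise, norm="ortho").abs())
        return spatial + spectral

    def reconstruct(x_adv, pred_noise):
        # Recover a non-adversarial image from a detected adversarial example.
        return torch.clamp(x_adv - pred_noise, 0.0, 1.0)

On CIFAR-sized batches of shape (N, 3, 32, 32), the detector's logits would then be trained with a standard cross-entropy loss on clean/adversarial labels; note that no output or hidden-layer activation of the target classifier is needed anywhere in this sketch.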


Data availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.


Acknowledgements

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No.2021-0-00511, Robust AI and Distributed Attack Detection for Edge AI Security).

Author information

Corresponding author

Correspondence to Minyoung Chung.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Author agreement

All authors, Seunghwan Jung, Minyoung Chung, and Yeong-Gil Shin, agreed to the submission of this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Jung, S., Chung, M. & Shin, YG. Adversarial example detection by predicting adversarial noise in the frequency domain. Multimed Tools Appl 82, 25235–25251 (2023). https://doi.org/10.1007/s11042-023-14608-6

