Abstract
Recent advances in deep neural network (DNN) techniques have increased the importance of the security and robustness of algorithms in which DNNs are applied. However, several studies have demonstrated that neural networks are vulnerable to adversarial examples, which are generated by adding crafted adversarial noise to input images. Because the adversarial noise is typically imperceptible to the human eye, defending DNNs is difficult. One defense is to detect adversarial examples by analyzing the characteristics of input images. Recent studies have used the hidden-layer outputs of the target classifier to improve robustness, but this requires access to the target classifier. Moreover, these methods include no post-processing step for the detected adversarial examples; the detected adversarial images are simply discarded. To resolve these problems, we propose a novel detection-based method that predicts the adversarial noise and detects the adversarial example based on the predicted noise, without any information about the target classifier. We first generated adversarial examples and the corresponding adversarial noise, obtained as the residual between the original and adversarial images. Subsequently, we trained the proposed adversarial noise predictor to estimate the adversarial noise image and trained the adversarial detector using the input images and the predicted noise. The proposed framework has the advantage of being agnostic to the input image modality. Moreover, the predicted noise can be used to reconstruct detected adversarial examples into non-adversarial images instead of discarding them. We tested the proposed method against the fast gradient sign method (FGSM), basic iterative method (BIM), projected gradient descent (PGD), DeepFool, and Carlini & Wagner attacks on the CIFAR-10 and CIFAR-100 datasets provided by the Canadian Institute for Advanced Research (CIFAR).
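The training-data pipeline described above, generating an adversarial example and taking the residual as the noise target, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses a toy linear scorer whose loss gradient is known in closed form so that the FGSM step can be written without an autodiff framework, and the function name `fgsm_example` is hypothetical.

```python
import numpy as np

def fgsm_example(x, w, y, eps):
    """Craft an FGSM adversarial example for a toy linear scorer.

    Toy loss: L(x) = -y * (w . x), whose gradient w.r.t. x is -y * w.
    FGSM perturbs the input by eps * sign(gradient) and clips to [0, 1].
    """
    grad = -y * w                                   # analytic gradient of the toy loss
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(8, 8))              # stand-in "image" in [0.2, 0.8]
w = rng.normal(size=(8, 8))                         # toy classifier weights
x_adv = fgsm_example(x, w, y=1.0, eps=0.03)

# Training target for the noise predictor: the residual between the
# adversarial example and the original image.
noise = x_adv - x
print(np.abs(noise).max())                          # bounded by eps

# Once a predictor estimates this noise, the detected adversarial example
# can be reconstructed rather than discarded: x_rec = x_adv - predicted_noise.
x_rec = x_adv - noise
print(np.allclose(x_rec, x))
```

With a perfect noise estimate the reconstruction recovers the original image exactly; in practice the predictor's output is an approximation.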
Our method demonstrated significant improvements in detection accuracy compared with state-of-the-art methods and resolved the problem of wasting the detected adversarial examples. Although agnostic to the input image modality, the proposed method showed that the noise predictor successfully captured noise in the Fourier domain, which improved detection performance. Moreover, the reconstruction process using the predicted noise resolved the post-processing problem for the detected adversarial examples.
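The Fourier-domain view of the noise mentioned above can be illustrated with a short sketch. This is an assumption-laden toy example, not the paper's predictor: the feature is simply the log-magnitude spectrum of a residual image, and `fourier_feature` is a hypothetical name. An FGSM-like residual (a dense ±ε sign pattern) carries broad spectral energy, whereas a clean input's residual is near zero, which is the kind of separation a Fourier-domain detector can exploit.

```python
import numpy as np

def fourier_feature(residual):
    """Log-magnitude spectrum of a (predicted) noise image.

    2-D FFT, shifted so the zero-frequency term sits at the centre;
    log1p compresses the dynamic range for the downstream detector.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(residual))
    return np.log1p(np.abs(spectrum))

rng = np.random.default_rng(1)
clean_residual = np.zeros((32, 32))                        # clean input: ~zero noise
adv_residual = 0.03 * np.sign(rng.normal(size=(32, 32)))   # FGSM-like residual

# The adversarial residual has strictly more spectral energy.
print(fourier_feature(adv_residual).mean() > fourier_feature(clean_residual).mean())
```

A simple detector could threshold (or learn a classifier on) such spectral features of the predicted noise.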
Data availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Acknowledgements
This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No.2021-0-00511, Robust AI and Distributed Attack Detection for Edge AI Security).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Author agreement
All authors, including Seunghwan Jung, Minyoung Chung, and Yeong-Gil Shin, agreed to the submission of this manuscript.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jung, S., Chung, M. & Shin, YG. Adversarial example detection by predicting adversarial noise in the frequency domain. Multimed Tools Appl 82, 25235–25251 (2023). https://doi.org/10.1007/s11042-023-14608-6