Abstract
Deep Neural Networks (DNNs) have achieved great success in many tasks in recent years. However, researchers have found that DNNs are vulnerable to adversarial examples, i.e., maliciously perturbed inputs. Elaborately designed adversarial perturbations can easily confuse a model while having no perceptible effect on human perception. To counter adversarial examples, we propose an integrated detection framework that combines a statistical detector and a Gaussian noise injection detector. The statistical detector extracts the Subtractive Pixel Adjacency Matrix (SPAM) and models it with a second-order Markov transition probability matrix to highlight the statistical anomalies hidden in an adversarial input; an ensemble classifier built on the SPAM-based features is then applied to detect adversarial inputs containing large perturbations. The Gaussian noise injection detector first injects additive Gaussian noise into the input and then feeds both the original input and its noise-injected counterpart into the targeted network. By comparing the difference between the two outputs, it detects adversarial inputs containing small perturbations: if the difference exceeds a threshold, the input is judged adversarial; otherwise it is judged legitimate. Because the two detectors are adapted to different characteristics of adversarial perturbations, the proposed framework is capable of detecting multiple types of adversarial examples. In our experiments, we test six categories of adversarial examples produced by the Fast Gradient Sign Method (FGSM, untargeted), the Randomized Fast Gradient Sign Method (R-FGSM, untargeted), the Basic Iterative Method (BIM, untargeted), DeepFool (untargeted), and the Carlini & Wagner method in untargeted (CW_UT) and targeted (CW_T) settings. Comprehensive empirical results show that the proposed detection framework achieves promising performance on the ImageNet database.
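To make the two detectors concrete, the minimal sketch below illustrates their general ideas in Python/NumPy. The `spam_features_horizontal` function computes a second-order Markov transition feature over horizontal pixel differences in the spirit of SPAM (only one of SPAM's directions, without the paper's exact truncation and averaging choices), and `gaussian_noise_detector` flags an input whose output distribution shifts by more than a threshold after additive Gaussian noise. The `model` callable, `sigma`, `threshold`, `n_trials`, and the L1 output-difference measure are illustrative assumptions, not the parameters or exact measures used in the paper.

```python
import numpy as np

def spam_features_horizontal(img, T=3):
    """Second-order SPAM-style feature for the left-to-right direction only.
    The full SPAM feature (Pevny et al., 2010) averages several directions;
    this single-direction version is an illustrative simplification.
    `img` is a 2-D grayscale array."""
    d = img[:, :-1].astype(int) - img[:, 1:].astype(int)   # adjacent-pixel differences
    d = np.clip(d, -T, T)                                   # truncate to [-T, T]
    dim = 2 * T + 1
    counts = np.zeros((dim, dim, dim))
    for row in d:                                           # count consecutive difference triples
        for i, j, k in zip(row[:-2], row[1:-1], row[2:]):
            counts[i + T, j + T, k + T] += 1
    totals = np.maximum(counts.sum(axis=2, keepdims=True), 1)
    return (counts / totals).ravel()                        # empirical transition probabilities

def gaussian_noise_detector(model, x, sigma=0.05, threshold=0.1, n_trials=5):
    """Flag `x` as adversarial if the model's softmax output changes too much
    after additive Gaussian noise.  `model` maps an image in [0, 1] to a
    probability vector; sigma and threshold are hypothetical values that
    would need tuning on a validation set."""
    probs_clean = model(x)
    diffs = []
    for _ in range(n_trials):
        noisy = np.clip(x + np.random.normal(0.0, sigma, size=x.shape), 0.0, 1.0)
        # L1 distance between the two output distributions (one plausible
        # choice of difference measure)
        diffs.append(np.abs(probs_clean - model(noisy)).sum())
    return float(np.mean(diffs)) > threshold                # True -> judged adversarial
```

In the integrated framework, the SPAM-based ensemble classifier would handle inputs with large perturbations, while a noise-injection test of this kind targets the small perturbations that statistical features tend to miss.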
References
Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy (SP), pp. 39–57
Das N, Shanbhogue M, Chen ST, Hohman F, Chen L, Kounavis ME, Chau DH (2017) Keeping the bad guys out: protecting and vaccinating deep learning with JPEG compression. arXiv preprint arXiv:1705.02900
Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255
Dziugaite GK, Ghahramani Z, Roy DM (2016) A study of the effect of JPG compression on adversarial images. arXiv preprint arXiv:1608.00853
Eykholt K, Evtimov I, Fernandes E, Li B, Rahmati A, Xiao C, Song D (2018) Robust physical-world attacks on deep learning models. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In International Conference on Learning Representations (ICLR)
Grosse K, Manoharan P, Papernot N, Backes M, McDaniel P (2017) On the (statistical) detection of adversarial examples. arXiv preprint arXiv:1702.06280
Gryllias KC, Antoniadis IA (2012) A support vector machine approach based on physical model training for rolling element bearing fault detection in industrial environments. Eng Appl Artif Intell 25(2):326–344
Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531
Kodovsky J, Fridrich J, Holub V (2012) Ensemble classifiers for steganalysis of digital media. IEEE Trans Inform Forens Sec 7(2):432–444
Kurakin A, Goodfellow I, Bengio S (2017) Adversarial examples in the physical world. In International Conference on Learning Representations (ICLR) Workshop Track
Li X, Li F (2017) Adversarial examples detection in deep networks with convolutional filter statistics. In International Conference on Computer Vision (ICCV), pp. 5775–5783
Liu Y, Zheng Y, Liang Y, Liu S, Rosenblum DS (2016) Urban water quality prediction based on multi-task multi-view learning. In International Joint Conference on Artificial Intelligence (IJCAI), pp. 2576–2582
Liu Y, Zhang L, Nie L, Yan Y, Rosenblum DS (2016) Fortune teller: predicting your career path. In AAAI Conference on Artificial Intelligence (AAAI), pp. 201–207
Liu Y, Nie L, Liu L, Rosenblum DS (2016) From action to activity: sensor-based activity recognition. Neurocomputing 181:108–115
Lu J, Issaranon T, Forsyth D (2017) SafetyNet: detecting and rejecting adversarial examples robustly. In International Conference on Computer Vision (ICCV), pp. 446–454
Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083
Meng D, Chen H (2017) MagNet: a two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 135–147
Metzen JH, Genewein T, Fischer V, Bischoff B (2017) On detecting adversarial perturbations. In International Conference on Learning Representations (ICLR)
Miyato T, Dai A M, Goodfellow I (2016) Adversarial training methods for semi-supervised text classification. arXiv preprint arXiv:1605.07725
Moosavi-Dezfooli SM, Fawzi A, Frossard P (2016) DeepFool: a simple and accurate method to fool deep neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2574–2582
Papernot N, McDaniel P, Sinha A, Wellman M (2016) Towards the science of security and privacy in machine learning. arXiv preprint arXiv:1611.03814
Papernot N, McDaniel P, Wu X, Jha S, Swami A (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE European Symposium on Security and Privacy (SP), pp. 582–597
Papernot N, Goodfellow I, Sheatsley R, Feinman R, McDaniel P (2016) cleverhans v2.0.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768
Pevny T, Bas P, Fridrich J (2010) Steganalysis by subtractive pixel adjacency matrix. IEEE Trans Inform Forens Sec 5(2):215–224
Santhanam GK, Grnarova P (2018) Defending against adversarial attacks by leveraging an entire GAN. arXiv preprint arXiv:1805.10652
Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117
Sharif M, Bhagavatula S, Bauer L, Reiter MK (2016) Accessorize to a crime: real and stealthy attacks on state-of-the-art face recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 1528–1540
Shen S, Jin G, Gao K, Zhang Y (2017) APE-GAN: adversarial perturbation elimination with GAN. arXiv preprint arXiv:1707.05474
Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R (2014) Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR) Workshop Track
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2818–2826
Tramèr F, Kurakin A, Papernot N, Goodfellow I, Boneh D, McDaniel P (2018) Ensemble adversarial training: Attacks and defenses. In International Conference on Learning Representations (ICLR)
Xie C, Wang J, Zhang Z, Ren Z, Yuille A (2018) Mitigating adversarial effects through randomization. In International Conference on Learning Representations (ICLR)
Xu W, Evans D, Qi Y (2017) Feature squeezing: detecting adversarial examples in deep neural networks. In Network and Distributed System Security Symposium (NDSS)
Acknowledgements
This work was supported by Shanghai Municipal Natural Science Foundation under Grant No. 16ZR1411100 and the National Natural Science Foundation of China under Grant No. 61771301.
Cite this article
Fan, W., Sun, G., Su, Y. et al. Integration of statistical detector and Gaussian noise injection detector for adversarial example detection in deep neural networks. Multimed Tools Appl 78, 20409–20429 (2019). https://doi.org/10.1007/s11042-019-7353-6