
Integration of statistical detector and Gaussian noise injection detector for adversarial example detection in deep neural networks


Abstract

Deep Neural Networks (DNNs) have achieved great success in many tasks in recent years. However, researchers have found that DNNs are vulnerable to adversarial examples, i.e., maliciously perturbed inputs. Elaborately designed adversarial perturbations can easily confuse a model while having no impact on human perception. To counter adversarial examples, we propose an integrated detection framework that combines a statistical detector and a Gaussian noise injection detector. The statistical detector extracts the Subtractive Pixel Adjacency Matrix (SPAM) and models it with a second-order Markov transition probability matrix so as to highlight the statistical anomaly hidden in an adversarial input; an ensemble classifier trained on the SPAM-based features then detects adversarial inputs containing large perturbations. The Gaussian noise injection detector first injects additive Gaussian noise into the input and then feeds both the original input and its noise-injected counterpart into the targeted network. By comparing the difference between the two outputs, this detector identifies adversarial inputs containing small perturbations: if the difference exceeds a threshold, the input is judged adversarial; otherwise, it is judged legitimate. Because the two detectors are adapted to different characteristics of adversarial perturbations, the proposed framework is capable of detecting multiple types of adversarial examples. In our work, we test six categories of adversarial examples produced by the Fast Gradient Sign Method (FGSM, untargeted), the Randomized Fast Gradient Sign Method (R-FGSM, untargeted), the Basic Iterative Method (BIM, untargeted), DeepFool (untargeted), the Carlini & Wagner method (CW_UT, untargeted), and its targeted variant (CW_T). Comprehensive empirical results show that the proposed detection framework achieves promising performance on the ImageNet database.
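The two branches of the framework can be made concrete with the short Python sketches below. They are illustrative reconstructions under stated assumptions, not the authors' implementation: the classifier handle `model`, the noise level `sigma`, the decision threshold `tau`, and the use of an L1 distance between softmax outputs are all hypothetical choices introduced here for clarity.

A minimal sketch of the Gaussian noise injection detector, assuming `model` maps a batch of images with pixel values in [0, 1] to softmax probability vectors:

import numpy as np

def gaussian_noise_detector(model, x, sigma=0.05, tau=0.1):
    # Inject additive Gaussian noise and clip back to the valid pixel range.
    x_noisy = np.clip(x + np.random.normal(0.0, sigma, size=x.shape), 0.0, 1.0)
    # Query the targeted network with both the original and the noisy input.
    p_clean = model(x[None, ...])[0]
    p_noisy = model(x_noisy[None, ...])[0]
    # Compare the two outputs; a large change suggests an adversarial input.
    diff = np.abs(p_clean - p_noisy).sum()
    return diff > tau  # True: flagged as adversarial; False: legitimate

A minimal sketch of the SPAM feature extraction used by the statistical detector, restricted to a single (left-to-right) direction for brevity; the original SPAM method of Pevny et al. forms second-order transition probabilities over eight directions:

import numpy as np

def spam_features_horizontal(img, T=3):
    # Difference array between horizontally adjacent pixels, truncated to [-T, T]
    # (T = 3 is the value used in the original SPAM paper).
    d = img[:, :-1].astype(np.int32) - img[:, 1:].astype(np.int32)
    d = np.clip(d, -T, T)
    n = 2 * T + 1
    counts = np.zeros((n, n, n))
    # Count co-occurrences of three consecutive differences.
    for a, b, c in zip(d[:, :-2].ravel(), d[:, 1:-1].ravel(), d[:, 2:].ravel()):
        counts[a + T, b + T, c + T] += 1
    # Normalise to second-order Markov transition probabilities P(c | a, b).
    totals = counts.sum(axis=2, keepdims=True)
    probs = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    return probs.ravel()  # (2T+1)^3-dimensional feature vector for this direction

In the framework described above, such feature vectors would be fed to an ensemble classifier trained to separate legitimate from adversarial inputs, while the noise injection test handles inputs whose perturbations are too small to leave a clear statistical footprint.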


Acknowledgements

This work was supported by Shanghai Municipal Natural Science Foundation under Grant No. 16ZR1411100 and the National Natural Science Foundation of China under Grant No. 61771301.

Author information

Corresponding author

Correspondence to Guangling Sun.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fan, W., Sun, G., Su, Y. et al. Integration of statistical detector and Gaussian noise injection detector for adversarial example detection in deep neural networks. Multimed Tools Appl 78, 20409–20429 (2019). https://doi.org/10.1007/s11042-019-7353-6

