Adversarial robustness via noise injection in smoothed models

Abstract

Deep neural networks are known to be vulnerable to malicious perturbations. Current methods for improving adversarial robustness rely on either implicit or explicit regularization, with the latter usually based on adversarial training. Randomized smoothing, the averaging of the classifier outputs over a random distribution centered at the sample, has been shown to guarantee a classifier’s performance subject to bounded perturbations of the input. In this work, we study the application of randomized smoothing to improve performance on unperturbed data and to increase robustness to adversarial attacks. We propose combining smoothing with adversarial training and randomization approaches, and find that doing so significantly improves resilience compared to the baseline. We examine our method’s performance under common white-box (FGSM, PGD) and black-box (transferable attack and NAttack) attacks on CIFAR-10 and CIFAR-100, and find that for a low number of attack iterations, smoothing provides a significant performance boost that persists even for perturbations with a high attack norm \(\epsilon\). For example, under a PGD-10 attack on CIFAR-10 using Wide-ResNet28-4, we achieve 60.3% accuracy for infinity norm \(\epsilon_{\infty} = 8/255\) and 13.1% accuracy for \(\epsilon_{\infty} = 35/255\), outperforming previous art by 3% and 6%, respectively. At \(\epsilon_{\infty} = 35/255\) we achieve nearly twice the accuracy of previous art, with even larger gains for perturbations of higher infinity norm. An implementation of the proposed method is available at https://github.com/yanemcovsky/SIAM.
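To make the smoothing step concrete, the following is a minimal sketch of smoothed prediction at inference time. It is not the authors' implementation (see the linked repository for that); the classifier `model`, noise scale `sigma`, and sample count `n_samples` are hypothetical names, and PyTorch is assumed. The smoothed classifier averages softmax outputs over Gaussian-perturbed copies of the input, as described above.

```python
import torch
import torch.nn.functional as F

def smoothed_predict(model, x, sigma=0.25, n_samples=32):
    """Monte Carlo estimate of a smoothed classifier's prediction:
    average the softmax outputs of `model` over copies of `x` perturbed
    with isotropic Gaussian noise of standard deviation `sigma`."""
    model.eval()
    avg_probs = None
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)       # sample from N(x, sigma^2 I)
            probs = F.softmax(model(noisy), dim=1)         # per-class probabilities
            avg_probs = probs if avg_probs is None else avg_probs + probs
    avg_probs = avg_probs / n_samples                      # Monte Carlo average
    return avg_probs.argmax(dim=1)                         # smoothed-model prediction
```

The sketch covers only the inference-time averaging; in the setting studied here, the same noise injection is also combined with adversarial training, i.e., adversarial perturbations are generated against noisy forward passes during training.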



Acknowledgements

The research was funded by the Hyundai Motor Company through the HYUNDAI-TECHNION-KAIST Consortium, National Cyber Security Authority, and the Hiroshi Fujiwara Technion Cyber Security Research Center.

Author information


Corresponding author

Correspondence to Evgenii Zheltonozhskii.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Nemcovsky, Y., Zheltonozhskii, E., Baskin, C. et al. Adversarial robustness via noise injection in smoothed models. Appl Intell 53, 9483–9498 (2023). https://doi.org/10.1007/s10489-022-03423-5

