Adversarial robustness via noise injection in smoothed models

Abstract

Deep neural networks are known to be vulnerable to malicious perturbations. Current methods for improving adversarial robustness rely on either implicit or explicit regularization, with the latter usually based on adversarial training. Randomized smoothing, the averaging of the classifier outputs over a random distribution centered at the sample, has been shown to guarantee a classifier’s performance subject to bounded perturbations of the input. In this work, we study the application of randomized smoothing to improve performance on unperturbed data and to increase robustness to adversarial attacks. We propose combining smoothing with adversarial training and randomization approaches, and find that doing so significantly improves resilience compared to the baseline. We examine our method’s performance under common white-box (FGSM, PGD) and black-box (transferable attack and NAttack) attacks on CIFAR-10 and CIFAR-100, and find that for a low number of attack iterations, smoothing provides a significant performance boost that persists even for perturbations with a high attack norm \(\epsilon\). For example, under a PGD-10 attack on CIFAR-10 using Wide-ResNet28-4, we achieve 60.3% accuracy for infinity norm \(\epsilon_{\infty} = 8/255\) and 13.1% accuracy for \(\epsilon_{\infty} = 35/255\), outperforming previous art by 3% and 6%, respectively. At \(\epsilon_{\infty} = 35/255\) we achieve nearly twice the accuracy of previous art, with even larger gains for perturbations of higher infinity norm. An implementation of the proposed method is available at https://github.com/yanemcovsky/SIAM.
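To make the smoothing step concrete, the following is a minimal sketch of smoothed prediction at inference time. It is not the authors' implementation (see the linked repository for that); the classifier `model`, noise scale `sigma`, and sample count `n_samples` are hypothetical names, and PyTorch is assumed. The smoothed classifier averages softmax outputs over Gaussian-perturbed copies of the input, as described above.

```python
import torch
import torch.nn.functional as F

def smoothed_predict(model, x, sigma=0.25, n_samples=32):
    """Monte Carlo estimate of a smoothed classifier's prediction:
    average the softmax outputs of `model` over copies of `x` perturbed
    with isotropic Gaussian noise of standard deviation `sigma`."""
    model.eval()
    avg_probs = None
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)       # sample from N(x, sigma^2 I)
            probs = F.softmax(model(noisy), dim=1)         # per-class probabilities
            avg_probs = probs if avg_probs is None else avg_probs + probs
    avg_probs = avg_probs / n_samples                      # Monte Carlo average
    return avg_probs.argmax(dim=1)                         # smoothed-model prediction
```

The sketch covers only the inference-time averaging; in the setting studied here, the same noise injection is also combined with adversarial training, i.e., adversarial perturbations are generated against noisy forward passes during training.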



Acknowledgements

The research was funded by the Hyundai Motor Company through the HYUNDAI-TECHNION-KAIST Consortium, National Cyber Security Authority, and the Hiroshi Fujiwara Technion Cyber Security Research Center.

Author information


Corresponding author

Correspondence to Evgenii Zheltonozhskii.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Nemcovsky, Y., Zheltonozhskii, E., Baskin, C. et al. Adversarial robustness via noise injection in smoothed models. Appl Intell 53, 9483–9498 (2023). https://doi.org/10.1007/s10489-022-03423-5

