
Certified defense against patch attacks via mask-guided randomized smoothing

  • Research Paper
  • Published in: Science China Information Sciences

Abstract

An adversarial patch is a practical and effective attack that modifies a small region of an image so that DNNs fail to classify it. Existing empirical defenses against adversarial patch attacks lack theoretical analysis and are vulnerable to adaptive attacks. To overcome these shortcomings, certified defenses have been proposed that guarantee classification performance against strong, unknown adversarial attacks. However, on the one hand, existing certified defenses either suffer from low clean accuracy or require a specific architecture that is still not robust enough; on the other hand, they only provide a provable accuracy and ignore its relationship to the number of perturbations. In this paper, we propose a certified defense against patch attacks that provides both a provable radius and high classification accuracy. By adding Gaussian noise only to the patch region through a mask, we prove that randomized smoothing can achieve a stronger certificate with high confidence. Furthermore, we design a practical scheme based on joint voting that locates the patch with high probability and certifies it effectively. Our defense achieves 86.4% clean accuracy and 71.8% certified accuracy on CIFAR-10, exceeding the maximum 60% certified accuracy of existing methods. On ImageNet, its clean accuracy of 67.8% and certified accuracy of 53.6% surpass the state-of-the-art method, whose certified accuracy is 26%.
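
The following minimal Python sketch illustrates the core smoothing step described above: Gaussian noise is added only inside a binary mask covering the suspected patch region, and the smoothed prediction is the majority vote over noisy samples. This is an illustrative sketch, not the authors' implementation; the classifier callable and the parameters sigma, n_samples, and num_classes are hypothetical, and the joint-voting patch localization and the certificate computation are omitted.

    import numpy as np

    def masked_smooth_predict(classifier, image, mask, sigma=0.5,
                              n_samples=100, num_classes=10, rng=None):
        # classifier : hypothetical callable mapping an (H, W, C) image to a class index
        # image      : float array of shape (H, W, C) with values in [0, 1]
        # mask       : binary array broadcastable to image; 1 marks the suspected patch region
        # sigma      : standard deviation of the Gaussian noise added inside the mask
        rng = np.random.default_rng() if rng is None else rng
        votes = np.zeros(num_classes, dtype=int)
        for _ in range(n_samples):
            noise = rng.normal(0.0, sigma, size=image.shape)
            noisy = np.clip(image + mask * noise, 0.0, 1.0)  # noise confined to the masked region
            votes[classifier(noisy)] += 1
        # the majority class is the smoothed prediction; the vote margin would enter the certificate
        return int(votes.argmax()), votes

In the paper's scheme, the mask itself is obtained beforehand by the joint-voting procedure that locates the patch with high probability, and the vote counts collected in this smoothing step would then feed the randomized-smoothing certificate.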


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. U20B2047, 62072421, 62002334, 62102386, 62121002), the Exploration Fund Project of University of Science and Technology of China (Grant No. YD348000-2001), and the Fundamental Research Funds for the Central Universities (Grant No. WK2100000011).

Author information

Corresponding author

Correspondence to Weiming Zhang.

Additional information

Supporting information Appendixes A–C. The supporting information is available online at info.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.


Cite this article

Zhang, K., Zhou, H., Bian, H. et al. Certified defense against patch attacks via mask-guided randomized smoothing. Sci. China Inf. Sci. 65, 170306 (2022). https://doi.org/10.1007/s11432-021-3457-7
