Abstract
Deep neural networks (DNNs) have been shown to be vulnerable to adversarial samples, and many powerful defense methods have been proposed to enhance their adversarial robustness. However, these defenses often require adding regularization terms to the loss function or augmenting the training data, which typically involves modifying the target model and increases computational cost. In this paper, we propose a novel adversarial defense approach that leverages a diffusion model with a large purification space to purify potential adversarial samples, and we introduce two training strategies, termed PSPG and PDPG, to defend against different attacks. Our method preprocesses adversarial examples before they are fed into the target model, and thus protects DNNs at inference time: it requires no modification to the target model and can even protect already-deployed models. Extensive experiments on CIFAR-10 and ImageNet demonstrate that our method achieves good accuracy and transferability, providing effective protection for different models in various defense scenarios. Our code is available at: https://github.com/YNU-JI/PDPG.
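The preprocessing pipeline described above can be illustrated with a generic diffusion-purification sketch: the (possibly adversarial) input is diffused forward for a number of steps, then the learned reverse process recovers a clean image before classification. The abstract does not specify the PSPG/PDPG training details, so the `purify` function, the `denoise_step` callback, and the DDPM-style noise schedule below are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def purify(x, denoise_step, t_star=30, alphas=None, rng=None):
    """Sketch of diffusion purification: diffuse x forward for t_star
    steps, then run a learned reverse process before classification.
    `denoise_step(x_t, t)` stands in for a trained diffusion model's
    reverse step; here it is an arbitrary user-supplied callable."""
    rng = rng or np.random.default_rng(0)
    if alphas is None:
        # Linear DDPM-style variance schedule (assumed, not from the paper).
        alphas = 1.0 - np.linspace(1e-4, 0.02, 1000)
    abar = np.cumprod(alphas)[t_star - 1]
    # Forward diffusion: x_t = sqrt(abar) * x + sqrt(1 - abar) * noise,
    # which drowns out small adversarial perturbations along with detail.
    x_t = np.sqrt(abar) * x + np.sqrt(1.0 - abar) * rng.standard_normal(x.shape)
    # Reverse process: iteratively denoise from t_star back to 0.
    for t in range(t_star, 0, -1):
        x_t = denoise_step(x_t, t)
    return x_t  # purified image, then fed to the unmodified target model
```

Because purification happens entirely before the classifier, the target model's weights and architecture are untouched, which is what makes the defense applicable to already-deployed models.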
Acknowledgements
This paper was supported by the National Natural Science Foundation of China under Grants 62162067 and 62101480.
Contributions
JJ: Methodology, Software, Conceptualization, Writing – original draft. SG: Methodology, Software, Investigation, Writing – review. WZ: Supervision, Funding acquisition.
Ethics declarations
Conflicts of interest
We declare that there is no conflict of interest related to the content of this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Cite this article
Ji, J., Gao, S. & Zhou, W. Transferable adversarial sample purification by expanding the purification space of diffusion models. Vis Comput 40, 8531–8543 (2024). https://doi.org/10.1007/s00371-023-03253-7