Abstract
Clean-label indiscriminate poisoning attacks add invisible perturbations to correctly labeled training images, dramatically reducing the generalization capability of victim models. Defense mechanisms such as adversarial training, image transformation techniques, and image purification have recently been proposed. However, these schemes are either susceptible to adaptive attacks, built on unrealistic assumptions, or effective only against specific poison types, which limits their universal applicability. In this research, we propose a more universally effective, practical, and robust defense scheme called ECLIPSE. We first investigate the impact of Gaussian noise on the poisons and theoretically prove that any kind of poison is largely assimilated when sufficient random noise is imposed. In light of this, we assume the victim has access to an extremely limited number of clean images (a more practical scenario) and subsequently enlarge this sparse set for training a denoising probabilistic model (a universal denoising tool). We then introduce Gaussian noise to absorb the poisons and apply the model for denoising, yielding a roughly purified dataset. Finally, to address the trade-off caused by the inconsistent sensitivity of different poisons to Gaussian noise assimilation, we propose a lightweight corruption compensation module that effectively eliminates residual poisons, providing a more universal defense approach. Extensive experiments demonstrate that our defense approach outperforms 10 state-of-the-art defenses. We also propose an adaptive attack against ECLIPSE and verify the robustness of our defense scheme. Our code is available at https://github.com/CGCL-codes/ECLIPSE.
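To make the pipeline concrete, below is a minimal sketch (ours, not the authors' released implementation at the repository above) of the two core steps the abstract describes: injecting Gaussian noise so that the poison perturbations are largely absorbed, then running DDPM-style reverse denoising to recover a roughly purified image. The linear beta schedule, the noising strength t_star, and the stand-in eps_model are illustrative assumptions; in ECLIPSE the denoiser is a denoising probabilistic model trained on the enlarged sparse clean set, and a corruption compensation module follows to remove residual poisons.

import torch

T = 1000                                  # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)     # linear beta schedule (common DDPM default)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def forward_noise(x0, t):
    # q(x_t | x_0): inject Gaussian noise up to timestep t so poisons are absorbed.
    eps = torch.randn_like(x0)
    return alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * eps

@torch.no_grad()
def denoise(eps_model, x_t, t_start):
    # Ancestral DDPM sampling from timestep t_start back to 0.
    x = x_t
    for t in range(t_start, -1, -1):
        eps_hat = eps_model(x, torch.full((x.shape[0],), t))
        mean = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    return x

# Illustration with a dummy noise predictor; a real denoiser would be a U-Net
# trained (per the paper) on the enlarged sparse clean set.
eps_model = lambda x, t: torch.zeros_like(x)
poisoned = torch.rand(8, 3, 32, 32)       # e.g. a CIFAR-10-sized poisoned batch
t_star = 100                              # noising strength (a tunable assumption)
purified = denoise(eps_model, forward_noise(poisoned, t_star), t_star)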
References
Biggio, B., Nelson, B., Laskov, P.: Support vector machines under adversarial label noise. In: Proceedings of the 3rd Asian Conference on Machine Learning (ACML’11), pp. 97–112 (2011)
Borgnia, E., et al.: Strong data augmentation sanitizes poisoning and backdoor attacks without an accuracy tradeoff. In: Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’21), pp. 3855–3859 (2021)
Chen, S., et al.: Self-ensemble protection: training checkpoints are good data protectors. In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’09), pp. 248–255 (2009)
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Dolatabadi, H.M., Erfani, S., Leckie, C.: The devil’s advocate: shattering the illusion of unexploitable data using diffusion models. arXiv preprint arXiv:2303.08500 (2023)
Feng, J., Cai, Q.-Z., Zhou, Z.H.: Learning to confuse: generating training time adversarial data with auto-encoder. In: Proceedings of the 33rd Neural Information Processing Systems (NeurIPS’19), vol. 32, pp. 11971–11981 (2019)
Fowl, L., et al.: Preventing unauthorized use of proprietary data: poisoning for secure dataset release. arXiv preprint arXiv:2103.02683 (2021)
Fowl, L., Goldblum, M., Chiang, P.Y., Geiping, J., Czaja, W., Goldstein, T.: Adversarial examples make strong poisons. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 30339–30351 (2021)
Fu, S., He, F., Liu, Y., Shen, L., Tao, D.: Robust unlearnable examples: protecting data privacy against adversarial learning. In: Proceedings of the 10th International Conference on Learning Representations (ICLR’22) (2022)
Geirhos, R., et al.: Shortcut learning in deep neural networks. Nature Mach. Intell. 2, 665–673 (2020)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’16), pp. 770–778 (2016)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Proceedings of the 34th Neural Information Processing Systems (NeurIPS’20), vol. 33, pp. 6840–6851 (2020)
Hong, S., Chandrasekaran, V., Kaya, Y., Dumitraş, T., Papernot, N.: On the effectiveness of mitigating data poisoning attacks with gradient shaping. arXiv preprint arXiv:2002.11497 (2020)
Hu, S., et al.: PointCRT: detecting backdoor in 3D point cloud via corruption robustness. In: Proceedings of the 31st ACM International Conference on Multimedia (MM’23), pp. 666–675 (2023)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’17), pp. 4700–4708 (2017)
Huang, H., Ma, X., Erfani, S.M., Bailey, J., Wang, Y.: Unlearnable examples: making personal data unexploitable. In: Proceedings of the 9th International Conference on Learning Representations (ICLR’21) (2021)
Jiang, W., Diao, Y., Wang, H., Sun, J., Wang, M., Hong, R.: Unlearnable examples give a false sense of security: Piercing through unexploitable data with learnable examples. In: Proceedings of the 31st ACM International Conference on Multimedia (MM’23), pp. 8910–8921 (2023)
Kostrikov, I., Fergus, R., Tompson, J., Nachum, O.: Offline reinforcement learning with fisher divergence critic regularization. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 5774–5783 (2021)
Krizhevsky, A.: Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto (2009)
Liu, X., et al.: Detecting backdoors during the inference stage based on corruption robustness consistency. In: Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’23), pp. 16363–16372 (2023)
Liu, Z., Zhao, Z., Larson, M.: Image shortcut squeezing: countering perturbative availability poisons with compression. In: Proceedings of the 40th International Conference on Machine Learning (ICML’23) (2023)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: Proceedings of the 6th International Conference on Learning Representations (ICLR’18) (2018)
Muñoz-González, L., et al.: Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec’17), pp. 27–38 (2017)
Nichol, A.Q., Dhariwal, P.: Improved denoising diffusion probabilistic models. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 8162–8171 (2021)
Nie, W., Guo, B., Huang, Y., Xiao, C., Vahdat, A., Anandkumar, A.: Diffusion models for adversarial purification. In: Proceedings of the 39th International Conference on Machine Learning (ICML’22) (2022)
Qin, T., Gao, X., Zhao, J., Ye, K., Xu, C.Z.: Learning the unlearnable: adversarial augmentations suppress unlearnable example attacks. arXiv preprint arXiv:2303.15127 (2023)
Ren, J., Xu, H., Wan, Y., Ma, X., Sun, L., Tang, J.: Transferable unlearnable examples. In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Sadasivan, V.S., Soltanolkotabi, M., Feizi, S.: CUDA: convolution-based unlearnable datasets. In: Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’23), pp. 3862–3871 (2023)
Sandoval-Segura, P., et al.: Poisons that are learned faster are more effective. In: Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW’22), pp. 198–205 (2022)
Sandoval-Segura, P., Singla, V., Geiping, J., Goldblum, M., Goldstein, T., Jacobs, D.W.: Autoregressive perturbations for data poisoning. In: Proceedings of the 36th Neural Information Processing Systems (NeurIPS’22), vol. 35 (2022)
Sandoval-Segura, P., Singla, V., Geiping, J., Goldblum, M., Goldstein, T.: What can we learn from unlearnable datasets? In: Proceedings of the 37th Neural Information Processing Systems (NeurIPS’23) (2023)
Särkkä, S., Solin, A.: Applied Stochastic Differential Equations, vol. 10. Cambridge University Press, Cambridge (2019)
Shen, J., Zhu, X., Ma, D.: TensorClog: an imperceptible poisoning attack on deep neural network applications. IEEE Access 7, 41498–41506 (2019)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Song, Y., Durkan, C., Murray, I., Ermon, S.: Maximum likelihood training of score-based diffusion models. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 1415–1428 (2021)
Song, Y., Sohl-Dickstein, J., Kingma, D., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. In: Proceedings of the 9th International Conference on Learning Representations (ICLR’21) (2021)
Tao, L., Feng, L., Yi, J., Huang, S.-J., Chen, S.: Better safe than sorry: preventing delusive adversaries with adversarial training. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 16209–16225 (2021)
Wang, X., Hu, S., Li, M., Yu, Z., Zhou, Z., Zhang, L.Y.: Corrupting convolution-based unlearnable datasets with pixel-based image transformations. arXiv preprint arXiv:2311.18403 (2023)
Wang, Z., Wang, Y., Wang, Y.: Fooling adversarial training with inducing noise. arXiv preprint arXiv:2111.10130 (2021)
Wen, R., Zhao, Z., Liu, Z., Backes, M., Wang, T., Zhang, Y.: Is adversarial training really a silver bullet for mitigating data poisoning? In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Yu, D., Zhang, H., Chen, W., Yin, J., Liu, T.Y.: Availability attacks create shortcuts. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’22), pp. 2367–2376 (2022)
Yuan, C.H., Wu, S.H.: Neural tangent generalization attacks. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 12230–12240 (2021)
Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: CutMix: regularization strategy to train strong classifiers with localizable features. In: Proceedings of the 17th International Conference on Computer Vision (ICCV’19) (2019)
Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: MixUp: beyond empirical risk minimization. In: Proceedings of the 6th International Conference on Learning Representations (ICLR’18) (2018)
Zhang, L., Shen, B., Barnawi, A., Xi, S., Kumar, N., Wu, Y.: FEDDPGAN: federated differentially private generative adversarial networks framework for the detection of COVID-19 pneumonia. Inf. Syst. Front. 23(6), 1403–1415 (2021)
Zhang, R., Zhu, Q.: A game-theoretic analysis of label flipping attacks on distributed support vector machines. In: Proceedings of the 51st Annual Conference on Information Sciences and Systems (CISS’17), pp. 1–6 (2017)
Zhang, Y., et al.: Why does little robustness help? A further step towards understanding adversarial transferability. In: Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P’24), vol. 2 (2024)
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Proceedings of the International Workshop on Deep Learning in Medical Image Analysis, pp. 3–11 (2018)
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant No. U20A20177) and Hubei Province Key R&D Technology Special Innovation Project (Grant No. 2021BAA032). Shengshan Hu and Peng Xu are co-corresponding authors.
A Appendix
Proof for Theorem 1: Based on the continuous-time forward process defined as the solution to the SDE [36], we have:
\[ \mathrm{d}x = h(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}\beta(t), \]
where g(t) is the diffusion coefficient, h(x, t) is the drift coefficient, and \(\beta (t)\) is a Brownian motion with a diffusion matrix. After this, according to the Fokker-Planck-Kolmogorov equation [33], we have:
\[ \frac{\partial p(x,t)}{\partial t} = \nabla_x \cdot \big[k_p(x,t)\,p(x,t)\big], \qquad \frac{\partial q(x,t)}{\partial t} = \nabla_x \cdot \big[k_q(x,t)\,q(x,t)\big], \]
where \(k_p(x,t)\) is defined as \(-h(x,t)+\frac{\nabla _x \log {p(x,t)}}{2}g^2(t)\), and \(k_q(x,t)\) is defined analogously with q(x, t). Then we have:
\[
\begin{aligned}
\frac{\partial}{\partial t} D_{KL}\big(p(x,t)\,\|\,q(x,t)\big)
&= \frac{\partial}{\partial t}\int p(x,t)\log\frac{p(x,t)}{q(x,t)}\,\mathrm{d}x\\
&= \int \Big[\frac{\partial p(x,t)}{\partial t}\log\frac{p(x,t)}{q(x,t)} + \frac{\partial p(x,t)}{\partial t} - \frac{p(x,t)}{q(x,t)}\,\frac{\partial q(x,t)}{\partial t}\Big]\,\mathrm{d}x\\
&= \int \Big[\nabla_x\cdot\big(k_p(x,t)p(x,t)\big)\log\frac{p(x,t)}{q(x,t)} + \nabla_x\cdot\big(k_p(x,t)p(x,t)\big) - \frac{p(x,t)}{q(x,t)}\,\nabla_x\cdot\big(k_q(x,t)q(x,t)\big)\Big]\,\mathrm{d}x\\
&= -\int p(x,t)\,\big(k_p(x,t)-k_q(x,t)\big)\cdot\nabla_x\log\frac{p(x,t)}{q(x,t)}\,\mathrm{d}x\\
&= -\frac{g^2(t)}{2}\int p(x,t)\,\big\|\nabla_x\log p(x,t)-\nabla_x\log q(x,t)\big\|^2\,\mathrm{d}x\\
&= -\frac{g^2(t)}{2}\,D_F\big(p(x,t)\,\|\,q(x,t)\big),
\end{aligned}
\]
where the fourth equality follows from integration by parts and our assumption of smooth and fast-decaying p(x, t) and q(x, t). Here, \(D_F\) denotes the Fisher divergence [19]. Since \(g^2(t) > 0\) and the Fisher divergence is non-negative, we have:
\[ \frac{\partial}{\partial t} D_{KL}\big(p(x,t)\,\|\,q(x,t)\big) \le 0, \]
where equality holds only if \(p(x,t)=q(x,t)\).
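As a numerical sanity check on the monotonicity just proved (this illustration is ours and not part of the paper), take two 1-D Gaussians standing in for the poisoned distribution p and the clean distribution q: under a VP-style forward process with a linear beta schedule, their closed-form KL divergence shrinks monotonically as the noising timestep grows. All parameters below are arbitrary assumptions.

import math

def kl_gauss(mu1, var1, mu2, var2):
    # KL( N(mu1, var1) || N(mu2, var2) ), closed form for 1-D Gaussians.
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]  # linear schedule

abar = 1.0                          # running product of (1 - beta_t)
mu_p, mu_q, var0 = 0.3, 0.0, 1.0    # "poisoned" mean shifted by a small perturbation
prev = float("inf")
for t in range(T):
    abar *= 1.0 - betas[t]
    # The forward process maps N(mu, var0) to N(sqrt(abar)*mu, abar*var0 + 1 - abar).
    var_t = abar * var0 + (1.0 - abar)
    kl = kl_gauss(math.sqrt(abar) * mu_p, var_t, math.sqrt(abar) * mu_q, var_t)
    assert kl <= prev + 1e-12       # KL never increases along the forward process
    prev = kl
    if t % 200 == 0:
        print(f"t={t:4d}  KL={kl:.6f}")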
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X. et al. (2024). ECLIPSE: Expunging Clean-Label Indiscriminate Poisons via Sparse Diffusion Purification. In: Garcia-Alfaro, J., Kozik, R., Choraś, M., Katsikas, S. (eds) Computer Security – ESORICS 2024. ESORICS 2024. Lecture Notes in Computer Science, vol 14982. Springer, Cham. https://doi.org/10.1007/978-3-031-70879-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70878-7
Online ISBN: 978-3-031-70879-4