Abstract
Clean-label indiscriminate poisoning attacks add invisible perturbations to correctly labeled training images, dramatically reducing the generalization capability of victim models. Defense mechanisms such as adversarial training, image transformation techniques, and image purification have recently been proposed. However, these schemes are either susceptible to adaptive attacks, built on unrealistic assumptions, or effective only against specific poison types, which limits their universal applicability. In this research, we propose a more universally effective, practical, and robust defense scheme called ECLIPSE. We first investigate the impact of Gaussian noise on the poisons and theoretically prove that any kind of poison is largely assimilated when sufficient random noise is imposed. In light of this, we assume the victim has access to an extremely limited number of clean images (a more practical scenario) and subsequently enlarge this sparse set for training a denoising probabilistic model (a universal denoising tool). We then introduce Gaussian noise to absorb the poisons and apply the model for denoising, yielding a roughly purified dataset. Finally, to address the trade-off caused by the inconsistent sensitivity of different poisons to Gaussian noise assimilation, we propose a lightweight corruption compensation module that effectively eliminates residual poisons, providing a more universal defense approach. Extensive experiments demonstrate that our defense approach outperforms 10 state-of-the-art defenses. We also propose an adaptive attack against ECLIPSE and verify the robustness of our defense scheme. Our code is available at https://github.com/CGCL-codes/ECLIPSE.
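To make the pipeline concrete, below is a minimal sketch (ours, not the authors' released implementation at the repository above) of the two core steps the abstract describes: injecting Gaussian noise so that the poison perturbations are largely absorbed, then running DDPM-style reverse denoising to recover a roughly purified image. The linear beta schedule, the noising strength t_star, and the stand-in eps_model are illustrative assumptions; in ECLIPSE the denoiser is a denoising probabilistic model trained on the enlarged sparse clean set, and a corruption compensation module follows to remove residual poisons.

import torch

T = 1000                                  # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)     # linear beta schedule (common DDPM default)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def forward_noise(x0, t):
    # q(x_t | x_0): inject Gaussian noise up to timestep t so poisons are absorbed.
    eps = torch.randn_like(x0)
    return alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * eps

@torch.no_grad()
def denoise(eps_model, x_t, t_start):
    # Ancestral DDPM sampling from timestep t_start back to 0.
    x = x_t
    for t in range(t_start, -1, -1):
        eps_hat = eps_model(x, torch.full((x.shape[0],), t))
        mean = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    return x

# Illustration with a dummy noise predictor; a real denoiser would be a U-Net
# trained (per the paper) on the enlarged sparse clean set.
eps_model = lambda x, t: torch.zeros_like(x)
poisoned = torch.rand(8, 3, 32, 32)       # e.g. a CIFAR-10-sized poisoned batch
t_star = 100                              # noising strength (a tunable assumption)
purified = denoise(eps_model, forward_noise(poisoned, t_star), t_star)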
References
Biggio, B., Nelson, B., Laskov, P.: Support vector machines under adversarial label noise. In: Proceedings of the 3rd Asian Conference on Machine Learning (ACML’11), pp. 97–112 (2011)
Borgnia, E., et al.: Strong data augmentation sanitizes poisoning and backdoor attacks without an accuracy tradeoff. In: Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’21), pp. 3855–3859 (2021)
Chen, S., et al.: Self-ensemble protection: training checkpoints are good data protectors. In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’09), pp. 248–255 (2009)
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Dolatabadi, H.M., Erfani, S., Leckie, C.: The devil’s advocate: shattering the illusion of unexploitable data using diffusion models. arXiv preprint arXiv:2303.08500 (2023)
Feng, J., Cai, Q.-Z., Zhou, Z.H.: Learning to confuse: generating training time adversarial data with auto-encoder. In: Proceedings of the 33rd Neural Information Processing Systems (NeurIPS’19), vol. 32, pp. 11971–11981 (2019)
Fowl, L., et al.: Preventing unauthorized use of proprietary data: poisoning for secure dataset release. arXiv preprint arXiv:2103.02683 (2021)
Fowl, L., Goldblum, M., Chiang, P.Y., Geiping, J., Czaja, W., Goldstein, T.: Adversarial examples make strong poisons. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 30339–30351 (2021)
Fu, S., He, F., Liu, Y., Shen, L., Tao, D.: Robust unlearnable examples: protecting data privacy against adversarial learning. In: Proceedings of the 10th International Conference on Learning Representations (ICLR’22) (2022)
Geirhos, R., et al.: Shortcut learning in deep neural networks. Nature Mach. Intell. 2, 665–673 (2020)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’16), pp. 770–778 (2016)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Proceedings of the 34th Neural Information Processing Systems (NeurIPS’20), vol. 33, pp. 6840–6851 (2020)
Hong, S., Chandrasekaran, V., Kaya, Y., Dumitraş, T., Papernot, N.: On the effectiveness of mitigating data poisoning attacks with gradient shaping. arXiv preprint arXiv:2002.11497 (2020)
Hu, S., et al.: PointCRT: detecting backdoor in 3D point cloud via corruption robustness. In: Proceedings of the 31st ACM International Conference on Multimedia (MM’23), pp. 666–675 (2023)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the 2017 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’17), pp. 4700–4708 (2017)
Huang, H., Ma, X., Erfani, S.M., Bailey, J., Wang, Y.: Unlearnable examples: making personal data unexploitable. In: Proceedings of the 9th International Conference on Learning Representations (ICLR’21) (2021)
Jiang, W., Diao, Y., Wang, H., Sun, J., Wang, M., Hong, R.: Unlearnable examples give a false sense of security: Piercing through unexploitable data with learnable examples. In: Proceedings of the 31st ACM International Conference on Multimedia (MM’23), pp. 8910–8921 (2023)
Kostrikov, I., Fergus, R., Tompson, J., Nachum, O.: Offline reinforcement learning with fisher divergence critic regularization. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 5774–5783 (2021)
Krizhevsky, A.: Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto (2009)
Liu, X., et al.: Detecting backdoors during the inference stage based on corruption robustness consistency. In: Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’23), pp. 16363–16372 (2023)
Liu, Z., Zhao, Z., Larson, M.: Image shortcut squeezing: countering perturbative availability poisons with compression. In: Proceedings of the 40th International Conference on Machine Learning (ICML’23) (2023)
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: Proceedings of the 6th International Conference on Learning Representations (ICLR’18) (2018)
Muñoz-González, L., et al.: Towards poisoning of deep learning algorithms with back-gradient optimization. In: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security (AISec’17), pp. 27–38 (2017)
Nichol, A.Q., Dhariwal, P.: Improved denoising diffusion probabilistic models. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 8162–8171 (2021)
Nie, W., Guo, B., Huang, Y., Xiao, C., Vahdat, A., Anandkumar, A.: Diffusion models for adversarial purification. In: Proceedings of the 39th International Conference on Machine Learning (ICML’22) (2022)
Qin, T., Gao, X., Zhao, J., Ye, K., Xu, C.Z.: Learning the unlearnable: adversarial augmentations suppress unlearnable example attacks. arXiv preprint arXiv:2303.15127 (2023)
Ren, J., Xu, H., Wan, Y., Ma, X., Sun, L., Tang, J.: Transferable unlearnable examples. In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Sadasivan, V.S., Soltanolkotabi, M., Feizi, S.: CUDA: convolution-based unlearnable datasets. In: Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’23), pp. 3862–3871 (2023)
Sandoval-Segura, P., et al.: Poisons that are learned faster are more effective. In: Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW’22), pp. 198–205 (2022)
Sandoval-Segura, P., Singla, V., Geiping, J., Goldblum, M., Goldstein, T., Jacobs, D.W.: Autoregressive perturbations for data poisoning. In: Proceedings of the 36th Neural Information Processing Systems (NeurIPS’22), vol. 35 (2022)
Sandoval-Segura, P., Singla, V., Geiping, J., Goldblum, M., Goldstein, T.: What can we learn from unlearnable datasets? In: Proceedings of the 37th Neural Information Processing Systems (NeurIPS’23) (2023)
Särkkä, S., Solin, A.: Applied Stochastic Differential Equations, vol. 10. Cambridge University Press, Cambridge (2019)
Shen, J., Zhu, X., Ma, D.: TensorClog: an imperceptible poisoning attack on deep neural network applications. IEEE Access 7, 41498–41506 (2019)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Song, Y., Durkan, C., Murray, I., Ermon, S.: Maximum likelihood training of score-based diffusion models. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 1415–1428 (2021)
Song, Y., Sohl-Dickstein, J., Kingma, D., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. In: Proceedings of the 9th International Conference on Learning Representations (ICLR’21) (2021)
Tao, L., Feng, L., Yi, J., Huang, S.-J., Chen, S.: Better safe than sorry: preventing delusive adversaries with adversarial training. In: Proceedings of the 35th Neural Information Processing Systems (NeurIPS’21), vol. 34, pp. 16209–16225 (2021)
Wang, X., Hu, S., Li, M., Yu, Z., Zhou, Z., Zhang, L.Y.: Corrupting convolution-based unlearnable datasets with pixel-based image transformations. arXiv preprint arXiv:2311.18403 (2023)
Wang, Z., Wang, Y., Wang, Y.: Fooling adversarial training with inducing noise. arXiv preprint arXiv:2111.10130 (2021)
Wen, R., Zhao, Z., Liu, Z., Backes, M., Wang, T., Zhang, Y.: Is adversarial training really a silver bullet for mitigating data poisoning? In: Proceedings of the 11th International Conference on Learning Representations (ICLR’23) (2023)
Yu, D., Zhang, H., Chen, W., Yin, J., Liu, T.Y.: Availability attacks create shortcuts. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’22), pp. 2367–2376 (2022)
Yuan, C.H., Wu, S.H.: Neural tangent generalization attacks. In: Proceedings of the 38th International Conference on Machine Learning (ICML’21), pp. 12230–12240 (2021)
Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: CutMix: regularization strategy to train strong classifiers with localizable features. In: Proceedings of the 17th International Conference on Computer Vision (ICCV’19) (2019)
Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: MixUp: beyond empirical risk minimization. In: Proceedings of the 6th International Conference on Learning Representations (ICLR’18) (2018)
Zhang, L., Shen, B., Barnawi, A., Xi, S., Kumar, N., Wu, Y.: FEDDPGAN: federated differentially private generative adversarial networks framework for the detection of COVID-19 pneumonia. Inf. Syst. Front. 23(6), 1403–1415 (2021)
Zhang, R., Zhu, Q.: A game-theoretic analysis of label flipping attacks on distributed support vector machines. In: Proceedings of the 51st Annual Conference on Information Sciences and Systems (CISS’17), pp. 1–6 (2017)
Zhang, Y., et al.: Why does little robustness help? A further step towards understanding adversarial transferability. In: Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P’24), vol. 2 (2024)
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Proceedings of the International Workshop on Deep Learning in Medical Image Analysis, pp. 3–11 (2018)
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant No. U20A20177) and Hubei Province Key R&D Technology Special Innovation Project (Grant No. 2021BAA032). Shengshan Hu and Peng Xu are co-corresponding authors.
A Appendix
Proof for Theorem 1: Based on the continuous-time forward process defined as the solution to the SDE [36], we have:
\[ \mathrm{d}x = h(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}\beta(t), \]
where g(t) is the diffusion coefficient, h(x, t) is the drift coefficient, and \(\beta (t)\) is a Brownian motion with a diffusion matrix. After this, according to the Fokker-Planck-Kolmogorov equation [33], we have:
\[ \frac{\partial p(x,t)}{\partial t} = \nabla_x \cdot \big[k_p(x,t)\,p(x,t)\big], \qquad \frac{\partial q(x,t)}{\partial t} = \nabla_x \cdot \big[k_q(x,t)\,q(x,t)\big], \]
where \(k_p(x,t)\) is defined as \(-h(x,t)+\frac{\nabla _x \log {p(x,t)}}{2}g^2(t)\), and \(k_q(x,t)\) is defined analogously with q(x, t). Then we have:
\[
\begin{aligned}
\frac{\partial}{\partial t} D_{KL}\big(p(x,t)\,\|\,q(x,t)\big)
&= \frac{\partial}{\partial t}\int p(x,t)\log\frac{p(x,t)}{q(x,t)}\,\mathrm{d}x\\
&= \int \Big[\frac{\partial p(x,t)}{\partial t}\log\frac{p(x,t)}{q(x,t)} + \frac{\partial p(x,t)}{\partial t} - \frac{p(x,t)}{q(x,t)}\,\frac{\partial q(x,t)}{\partial t}\Big]\,\mathrm{d}x\\
&= \int \Big[\nabla_x\cdot\big(k_p(x,t)p(x,t)\big)\log\frac{p(x,t)}{q(x,t)} + \nabla_x\cdot\big(k_p(x,t)p(x,t)\big) - \frac{p(x,t)}{q(x,t)}\,\nabla_x\cdot\big(k_q(x,t)q(x,t)\big)\Big]\,\mathrm{d}x\\
&= -\int p(x,t)\,\big(k_p(x,t)-k_q(x,t)\big)\cdot\nabla_x\log\frac{p(x,t)}{q(x,t)}\,\mathrm{d}x\\
&= -\frac{g^2(t)}{2}\int p(x,t)\,\big\|\nabla_x\log p(x,t)-\nabla_x\log q(x,t)\big\|^2\,\mathrm{d}x\\
&= -\frac{g^2(t)}{2}\,D_F\big(p(x,t)\,\|\,q(x,t)\big),
\end{aligned}
\]
where the fourth equality follows from integration by parts and our assumption of smooth and fast-decaying p(x, t) and q(x, t). Here, \(D_F\) denotes the Fisher divergence [19]. Since \(g^2(t) > 0\) and the Fisher divergence is non-negative, we have:
\[ \frac{\partial}{\partial t} D_{KL}\big(p(x,t)\,\|\,q(x,t)\big) \le 0, \]
where equality holds only if \(p(x,t)=q(x,t)\).
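As a numerical sanity check on the monotonicity just proved (this illustration is ours and not part of the paper), take two 1-D Gaussians standing in for the poisoned distribution p and the clean distribution q: under a VP-style forward process with a linear beta schedule, their closed-form KL divergence shrinks monotonically as the noising timestep grows. All parameters below are arbitrary assumptions.

import math

def kl_gauss(mu1, var1, mu2, var2):
    # KL( N(mu1, var1) || N(mu2, var2) ), closed form for 1-D Gaussians.
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

T = 1000
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]  # linear schedule

abar = 1.0                          # running product of (1 - beta_t)
mu_p, mu_q, var0 = 0.3, 0.0, 1.0    # "poisoned" mean shifted by a small perturbation
prev = float("inf")
for t in range(T):
    abar *= 1.0 - betas[t]
    # The forward process maps N(mu, var0) to N(sqrt(abar)*mu, abar*var0 + 1 - abar).
    var_t = abar * var0 + (1.0 - abar)
    kl = kl_gauss(math.sqrt(abar) * mu_p, var_t, math.sqrt(abar) * mu_q, var_t)
    assert kl <= prev + 1e-12       # KL never increases along the forward process
    prev = kl
    if t % 200 == 0:
        print(f"t={t:4d}  KL={kl:.6f}")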
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, X. et al. (2024). ECLIPSE: Expunging Clean-Label Indiscriminate Poisons via Sparse Diffusion Purification. In: Garcia-Alfaro, J., Kozik, R., Choraś, M., Katsikas, S. (eds) Computer Security – ESORICS 2024. ESORICS 2024. Lecture Notes in Computer Science, vol 14982. Springer, Cham. https://doi.org/10.1007/978-3-031-70879-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70878-7
Online ISBN: 978-3-031-70879-4