Dual Feature Distributional Regularization for Defending Against Adversarial Attacks

Li, Mingyang; Xu, Xiangxiang; Huang, Shao-Lun; Zhang, Lin

doi:10.1007/978-3-030-92310-5_44


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1517)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In recent years, the security of widely used deep learning models has been threatened by adversarial attacks. In this paper, we incorporate the recently proposed H-score metric into the adversarial training framework to further improve model robustness on classification tasks. Specifically, we propose a novel defense method called Dual Feature Distributional Regularization (DFDR), which applies dual-level regularization to the feature distributions of both normal and adversarial examples, maximizing inter-class feature distance and minimizing intra-class feature distance in a normalized feature space. The experimental results show that DFDR not only outperforms many other defense methods against adversarial attacks but also effectively improves adversarial detection results.
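The stated goal, small intra-class and large inter-class feature distance in a normalized feature space, can be illustrated with a minimal sketch. This is not the DFDR loss or the H-score from the paper, neither of which is specified in the abstract. It is a generic centroid-based distance penalty, and the function names, the `weight` parameter, and the choice of L2 normalization are all assumptions made for illustration.

```python
import numpy as np

def normalize(features, eps=1e-12):
    """Scale each feature vector to unit L2 norm (assumed normalization)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

def class_distance_stats(features, labels):
    """Return (intra, inter) squared distances on normalized features.

    intra: mean squared distance from each sample to its class centroid.
    inter: mean squared distance between distinct class centroids.
    """
    z = normalize(np.asarray(features, dtype=float))
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.stack([z[labels == c].mean(axis=0) for c in classes])

    # Map every sample to the row index of its own class centroid.
    idx = np.searchsorted(classes, labels)
    intra = np.mean(np.sum((z - centroids[idx]) ** 2, axis=1))

    # Pairwise squared distances between centroids, upper triangle only.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    sq = np.sum(diffs ** 2, axis=-1)
    k = len(classes)
    inter = sq[np.triu_indices(k, k=1)].mean() if k > 1 else 0.0
    return float(intra), float(inter)

def distance_regularizer(features, labels, weight=1.0):
    """Hypothetical penalty: lower is better (tight classes, far apart)."""
    intra, inter = class_distance_stats(features, labels)
    return intra - weight * inter

# Toy usage: two classes whose normalized features lie on orthogonal axes.
feats = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
labs = [0, 0, 1, 1]
intra, inter = class_distance_stats(feats, labs)
penalty = distance_regularizer(feats, labs)
```

In a dual-level setting such as the one the abstract describes, a penalty of this kind would presumably be evaluated on features of both normal and adversarial inputs, but how the two terms are combined in the paper is not stated here.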

Acknowledgement

The research of Shao-Lun Huang was supported in part by the Natural Science Foundation of China under Grant 61807021, in part by the Shenzhen Science and Technology Program under Grant KQTD20170810150821146, and in part by the Innovation and Entrepreneurship Project for Overseas High-Level Talents of Shenzhen under Grant KQJSCX20180327144037831.

Author information

Authors and Affiliations

Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua University, Shenzhen, China
Mingyang Li, Xiangxiang Xu, Shao-Lun Huang & Lin Zhang

Corresponding author

Correspondence to Shao-Lun Huang.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

Cite this paper

Li, M., Xu, X., Huang, SL., Zhang, L. (2021). Dual Feature Distributional Regularization for Defending Against Adversarial Attacks. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1517. Springer, Cham. https://doi.org/10.1007/978-3-030-92310-5_44

DOI: https://doi.org/10.1007/978-3-030-92310-5_44
Published: 02 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92309-9
Online ISBN: 978-3-030-92310-5
eBook Packages: Computer Science, Computer Science (R0)
