Abstract
Batch normalization is one of the most widely used components in deep neural networks: it accelerates training and improves performance on clean samples. However, batch normalization also makes models more vulnerable to adversarial examples, especially on medical images, and the reason remains unclear. In this paper, we aim to explain the vulnerability that batch normalization induces under adversarial images. Specifically, we first show that both natural and medical images contain a large number of trivial features, whose weights are enlarged under adversarial attacks and further enlarged by batch normalization. In addition, we find that batch normalization reduces the inter-class margin of high-level features, leaving less tolerance to adversarial perturbations and thereby decreasing model robustness. Moreover, we hypothesize that the smaller the inter-class margin, the harder it is to attain the optimal classification space, which implies that batch normalization also restricts the performance of adversarial training; this further verifies that the narrower inter-class margin induced by batch normalization reduces model robustness. Experiments on four benchmark datasets support our discovery, interpretation, and hypothesis.
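To make the inter-class margin claim concrete, the following is a minimal sketch, not the paper's measurement protocol, of one way to estimate the margin of high-level features: the distance between class centroids in the penultimate feature space, minus the intra-class spread. The `model.features` accessor and this particular margin definition are our assumptions for illustration.

```python
# A minimal sketch (assumed margin definition, not the paper's protocol):
# estimate an inter-class margin of high-level features as the gap between
# class-centroid distance and the two intra-class spreads.
import torch

@torch.no_grad()
def inter_class_margin(model, loader, device="cpu"):
    """Collect penultimate features per class and compute a margin proxy."""
    feats = {}  # class label -> list of high-level feature vectors
    model.eval().to(device)
    for x, y in loader:
        # Assumption: model.features(x) returns the high-level feature map;
        # adapt this to wherever the architecture exposes it.
        f = torch.flatten(model.features(x.to(device)), start_dim=1)
        for fi, yi in zip(f, y):
            feats.setdefault(int(yi), []).append(fi)
    labels = sorted(feats)
    centroids = {c: torch.stack(feats[c]).mean(0) for c in labels}
    spread = {
        c: torch.stack([(v - centroids[c]).norm() for v in feats[c]]).mean()
        for c in labels
    }
    margins = []
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            # Margin proxy: centroid distance minus both intra-class spreads.
            margins.append((centroids[a] - centroids[b]).norm()
                           - spread[a] - spread[b])
    return torch.stack(margins).mean()
```

Under the paper's finding, this proxy should come out smaller for a batch-normalized model than for its normalization-free counterpart, leaving less room for adversarial perturbations before features cross a decision boundary.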




Notes
ImageNet: available at https://www.kaggle.com/c/imagenet-object-localization-challenge/data
NIH Chest X-rays: available at https://www.kaggle.com/nih-chest-xrays/data
RSNA Pneumonia Detection Challenge: available at https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data
COCO: available at https://cocodataset.org/#download
Funding
This work is partially supported by the National Natural Science Foundation of China (Grant No: 61876046) and the Guangxi “Bagui” Teams for Innovation and Research.
Additional information
This article belongs to the Topical Collection: Special Issue on Web-based Intelligent Financial Services.
Guest Editors: Hong-Ning Dai, Xiaohui Haoran, and Miguel Martinez.
Appendices
Appendix A: Details about the Datasets
Tables 5 and 6 show the selected categories and the number of images used for training and testing. We select these classes because they contain relatively many samples, which makes it easier to train our models and to analyze the results.
Appendix B: Details of Network Architectures
Table 7 shows the details of the convolutional layers of the VGG models; the locations where mid-level and high-level features are extracted are marked with a blank row. Table 8 displays the details of the fully connected layers: the top block gives the details of VGG16, and the bottom block those of VGG-C.
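Since Table 7 only marks the extraction points, the sketch below shows one hypothetical way to tap mid-level and high-level feature maps from a torchvision VGG16 with forward hooks. The layer indices 15 and 29 are placeholders to be matched against the blank-row markers in Table 7, not the paper's actual choice.

```python
# Hypothetical feature taps on a torchvision VGG16; the indices
# (15 for "mid-level", 29 for "high-level") are placeholders to be
# aligned with the blank-row markers in Table 7.
import torch
from torchvision.models import vgg16

model = vgg16(weights=None).eval()
captured = {}

def tap(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

model.features[15].register_forward_hook(tap("mid"))
model.features[29].register_forward_hook(tap("high"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))
print(captured["mid"].shape, captured["high"].shape)
```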
Appendix C: The Architecture of the Basic Block
Table 9 shows the details of ResNet18; the locations where mid-level and high-level features are extracted are marked with a blank row. Figure 5 presents the details of the Basic Block used in Table 9.
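For readers without access to Figure 5, we assume the block follows the standard ResNet18 Basic Block of He et al.: two 3x3 convolutions, each followed by batch normalization (the component under study here), with an identity or 1x1-projection shortcut. A minimal sketch under that assumption:

```python
# A minimal sketch of the standard ResNet Basic Block (assumed to match
# Figure 5): two 3x3 convolutions, each followed by batch normalization,
# plus an identity or 1x1-projection shortcut.
import torch.nn as nn

class BasicBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # Projection shortcut when spatial size or channel count changes.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))
```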
Cite this article
Kong, F., Liu, F., Xu, K. et al. Why does batch normalization induce the model vulnerability on adversarial images?. World Wide Web 26, 1073–1091 (2023). https://doi.org/10.1007/s11280-022-01066-7