Abstract
Adversarial examples can attack multiple unknown convolutional neural networks (CNNs) due to adversarial transferability, which reveals the vulnerability of CNNs and facilitates the development of adversarial attacks. However, most existing adversarial attack methods exhibit limited transferability on vision transformers (ViTs). In this paper, we propose a partial blocks search attack (PBSA) method to generate adversarial examples on ViTs, which significantly enhances transferability. Instead of applying the same strategy to all encoder blocks of a ViT, we divide the encoder blocks into two categories by introducing a block weight score and process each category with a distinct strategy. In addition, we optimize the generation of perturbations by regularizing the self-attention feature maps and creating an ensemble of partial blocks. Finally, the perturbations are adjusted by an adaptive weight so that they disturb the most influential pixels of the original images. Extensive experiments on the ImageNet dataset demonstrate the validity and effectiveness of the proposed PBSA. The experimental results show the superiority of PBSA over state-of-the-art attack methods on both ViTs and CNNs. Furthermore, PBSA can be flexibly combined with existing methods to further enhance the transferability of adversarial examples.
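The pipeline the abstract describes — score the encoder blocks, split them into two groups for distinct processing, and weight the perturbation toward the most influential pixels — can be sketched in a simplified form. This is only an illustrative sketch: the function names are hypothetical, and the mean-magnitude score and per-pixel gradient weighting used here are plain stand-ins for the paper's actual block weight score and adaptive weight, which are defined in the full text.

```python
def block_weight_scores(attn_maps):
    # One score per encoder block: here, the mean absolute value of its
    # self-attention feature map (each map given as a flat list of floats).
    # A simplified stand-in for the paper's block weight score.
    return [sum(abs(x) for x in a) / len(a) for a in attn_maps]

def split_blocks(scores, ratio=0.5):
    # Divide block indices into a high-score and a low-score group,
    # so each group can be processed with a distinct strategy.
    k = max(1, int(len(scores) * ratio))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k]), sorted(order[k:])

def adaptive_perturbation(grad, eps=8 / 255):
    # Scale the sign-gradient per pixel by its relative magnitude, so the
    # most influential pixels receive the largest perturbation.
    m = max(abs(g) for g in grad) or 1.0
    sign = lambda g: (g > 0) - (g < 0)
    return [eps * (abs(g) / m) * sign(g) for g in grad]
```

For example, with four blocks whose attention maps average 0.1, 0.4, 0.2, and 0.3, `split_blocks` assigns blocks 1 and 3 to the high-score group and blocks 0 and 2 to the low-score group.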









Data Availability Statement
All datasets utilized during this study are available at https://image-net.org. The pre-trained CNN models are publicly available at https://github.com/Cadene/pretrained-models.pytorch, and the pre-trained ViT models are publicly available at https://github.com/rwightman/pytorch-image-models.
Acknowledgements
The work described in this paper was supported in part by the National Natural Science Foundation of China (Grant No. 62071275), and in part by the Shandong Province Key Innovation Project (Grant No. 2020CXGC010903, 2021SFGC0701).
Contributions
All authors contributed to the study conception and design. The first draft of the manuscript was written by Yanyang Han and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval
Not applicable
Consent to participate
Not applicable
Consent for publication
Not applicable
Code availability
The code of PBSA utilized during this study is available at https://github.com/yanyanghan/PBSA.git.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Han, Y., Liu, J., Liu, X. et al. Enhancing adversarial transferability with partial blocks on vision transformer. Neural Comput & Applic 34, 20249–20262 (2022). https://doi.org/10.1007/s00521-022-07568-9