Abstract
This paper focuses on an important type of black-box attack, i.e., the transfer-based adversarial attack, where the adversary generates adversarial examples with a substitute (source) model and uses them to attack an unseen target model without access to its internal information. Existing methods tend to yield unsatisfactory adversarial transferability when the source and target models belong to different types of DNN architectures (e.g., ResNet-18 and Swin Transformer). In this paper, we observe that this phenomenon is induced by an output inconsistency problem. To alleviate this problem while effectively utilizing existing DNN models, we propose a common knowledge learning (CKL) framework that learns better network weights, under fixed network architectures, for generating adversarial examples with better transferability. Specifically, to reduce model-specific features and obtain better output distributions, we construct a multi-teacher framework in which knowledge is distilled from different teacher architectures into a single student network. Since the gradient with respect to the input is usually utilized to generate adversarial examples, we further impose constraints on the gradients between the student and teacher models, which alleviates the output inconsistency problem and enhances adversarial transferability. Extensive experiments demonstrate that our proposed work can significantly improve adversarial transferability.
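To make the setup concrete, the sketch below illustrates the kind of training objective the abstract describes: a student with a fixed architecture is trained on the averaged soft outputs of several teacher architectures (multi-teacher distillation), with an additional term that aligns the student's input gradient with the teachers' average input gradient, since gradient-based attacks operate on exactly that gradient. This is a minimal PyTorch illustration under stated assumptions; the function name `ckl_loss`, the weights `lambda_kd` and `lambda_grad`, the temperature `tau`, and the cosine form of the gradient constraint are illustrative choices, not the paper's exact formulation.

```python
# Minimal sketch of a common-knowledge-learning style objective (assumed form):
# cross-entropy + multi-teacher knowledge distillation + input-gradient alignment.
import torch
import torch.nn.functional as F

def ckl_loss(student, teachers, x, y, tau=4.0, lambda_kd=1.0, lambda_grad=0.1):
    """Combine task loss, averaged-teacher distillation, and gradient alignment."""
    x = x.clone().requires_grad_(True)

    s_logits = student(x)
    ce = F.cross_entropy(s_logits, y)

    # Average the teachers' softened output distributions ("common knowledge").
    with torch.no_grad():
        t_probs = torch.stack(
            [F.softmax(t(x) / tau, dim=1) for t in teachers]).mean(dim=0)
    kd = F.kl_div(F.log_softmax(s_logits / tau, dim=1), t_probs,
                  reduction="batchmean") * tau * tau

    # Gradient-based attacks (e.g., FGSM) use the loss gradient w.r.t. the input,
    # so pull the student's input gradient toward the teachers' average gradient.
    s_grad = torch.autograd.grad(ce, x, create_graph=True)[0]
    t_grads = []
    for t in teachers:
        xt = x.detach().clone().requires_grad_(True)
        t_grads.append(torch.autograd.grad(F.cross_entropy(t(xt), y), xt)[0])
    t_grad = torch.stack(t_grads).mean(dim=0).detach()
    grad_align = 1 - F.cosine_similarity(s_grad.flatten(1),
                                         t_grad.flatten(1), dim=1).mean()

    return ce + lambda_kd * kd + lambda_grad * grad_align
```

Once such a student is trained, adversarial examples would be generated on it with any standard gradient-based attack and then transferred to the unseen target model.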
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 62272020 and U20B2069), in part by the State Key Laboratory of Complex & Critical Software Environment (SKLSDE2023ZX-16), and in part by the Fundamental Research Funds for the Central Universities.
Ethics declarations
Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Ruijie YANG received the BS degree from Beihang University, China in 2018. He is now a PhD student at the Laboratory of Intelligent Recognition and Image Processing, Beihang University, China. His research interests are adversarial machine learning, interpretability of DNNs, and model security.
Yuanfang GUO received his BEng degree in computer engineering and PhD degree in electronic and computer engineering from The Hong Kong University of Science and Technology, China in 2009 and 2015, respectively. He is currently an associate professor with the School of Computer Science and Engineering, Beihang University, China. His current research interests include multimedia security, artificial intelligence security, and graph neural networks. He has published over 90 scientific papers.
Junfu WANG received the BE degree in software engineering from Chongqing University, China in 2019. He is currently working toward the PhD degree with the Laboratory of Intelligent Recognition and Image Processing, School of Computer Science and Engineering, Beihang University, China. His research interests include graph representation learning, in particular on large-scale networks, heterogeneous networks, and networks with heterophily.
Jiantao ZHOU received his BEng degree from the Department of Electronic Engineering, Dalian University of Technology, China in 2002, the MPhil degree from the Department of Radio Engineering, Southeast University, China in 2005, and the PhD degree from the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, China in 2009. He is currently a professor with the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, China. His research interests include multimedia security and forensics, multimedia signal processing, artificial intelligence, and big data.
Yunhong WANG (Fellow, IEEE) is currently a professor at Beihang University, China, where she is also the Director of the Laboratory of Intelligent Recognition and Image Processing, School of Computer Science and Engineering. She received the PhD degree in electronic engineering from the Nanjing University of Science and Technology, China in 1998. She was with the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China from 1998 to 2004. Her research interests include biometrics, pattern recognition, computer vision, data fusion, and image processing. She has published more than 300 research papers in international journals and conferences. She has served on the editorial boards of IEEE Transactions on Dependable and Secure Computing, IEEE Transactions on Biometrics, Behavior, and Identity Sciences, and Pattern Recognition. She is a Fellow of IEEE, IAPR, CAAI, and CCF.
About this article
Cite this article
Yang, R., Guo, Y., Wang, J. et al. Common knowledge learning for generating transferable adversarial examples. Front. Comput. Sci. 19, 1910359 (2025). https://doi.org/10.1007/s11704-024-40533-4