Towards Query-Efficient Black-Box Attacks: A Universal Dual Transferability-Based Framework

Published: 08 May 2023

Abstract

Adversarial attacks threaten the deployment of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by querying it many times and producing global perturbations. However, not all pixels are equally important to the target model, so treating every pixel indiscriminately inevitably inflates query overhead. In addition, existing black-box attacks use clean samples as starting points, which further limits query efficiency. In this article, we propose a novel black-box attack framework, built on a dual transferability (DT) strategy, that perturbs only the discriminative areas of clean examples within a limited query budget. The first kind of transferability is the transferability of model interpretations: based on this property, we identify the discriminative areas of clean samples in which to generate local perturbations. The second is the transferability of adversarial examples, which we use to produce local pre-perturbations that further improve query efficiency. Both kinds of transferability are obtained from an independent auxiliary model and therefore incur no extra query overhead. After identifying the discriminative areas and generating pre-perturbations, we use the pre-perturbed samples as better starting points and perturb them locally in a black-box manner to search for the corresponding adversarial examples. Because the DT strategy is general, the proposed framework can be applied to different types of black-box attacks. Extensive experiments show that, under various system settings, our framework significantly improves both the query efficiency and the attack success rates of existing black-box attacks.
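The abstract does not spell out implementation details, but the pipeline it outlines, namely interpretation-guided selection of discriminative areas on an auxiliary model, a transfer-based local pre-perturbation, and a local query-based search against the black-box target, can be sketched roughly as follows. This is a minimal, hypothetical sketch only: it assumes a PyTorch surrogate (`surrogate`, with a user-chosen convolutional `target_layer`), uses Grad-CAM for the interpretation step, a single FGSM-style step for the pre-perturbation, and a SimBA-like coordinate search for the black-box refinement, and treats `query_fn` as a black-box oracle returning class probabilities; none of these concrete components or names is prescribed by the article.

```python
import torch
import torch.nn.functional as F

def gradcam_mask(surrogate, x, target_layer, top_frac=0.2):
    """Grad-CAM on the auxiliary (surrogate) model; keep the top-scoring pixels as the attack region."""
    feats, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    x = x.clone().requires_grad_(True)
    logits = surrogate(x)
    surrogate.zero_grad()
    logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum().backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)            # GAP over gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))  # weighted feature maps
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    thresh = torch.quantile(cam.flatten(1), 1.0 - top_frac, dim=1).view(-1, 1, 1, 1)
    return (cam >= thresh).float()                               # binary region mask

def local_preperturbation(surrogate, x, y, mask, eps=8 / 255):
    """Transfer-based pre-perturbation (one FGSM-style step on the surrogate), restricted to the mask."""
    x_adv = x.clone().requires_grad_(True)
    F.cross_entropy(surrogate(x_adv), y).backward()
    return (x + eps * mask * x_adv.grad.sign()).clamp(0, 1).detach()

def local_blackbox_search(query_fn, x0, label, mask, eps=8 / 255, max_queries=1000):
    """SimBA-like coordinate search over masked pixels only, starting from the pre-perturbed x0."""
    x = x0.clone()
    coords = mask[0, 0].nonzero(as_tuple=False)       # (row, col) pairs inside the region
    coords = coords[torch.randperm(coords.shape[0])]
    best, used = query_fn(x)[0, label].item(), 1      # true-class probability to be decreased
    for r, c in coords.tolist():
        if used >= max_queries:
            break
        for sign in (eps, -eps):
            cand = x.clone()
            cand[0, :, r, c] = (cand[0, :, r, c] + sign).clamp(0, 1)
            prob, used = query_fn(cand)[0, label].item(), used + 1
            if prob < best:
                x, best = cand, prob
                break
    return x
```

In this sketch the first two steps touch only the auxiliary model, so, in the spirit of the DT strategy, they consume no target queries; only the final local search spends the query budget, and it does so over a small masked region from a pre-perturbed starting point.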

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 14, Issue 4
August 2023
481 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3596215
  • Editor: Huan Liu

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 May 2023
Online AM: 13 February 2023
Accepted: 23 January 2023
Revised: 20 October 2022
Received: 01 May 2022
Published in TIST Volume 14, Issue 4

Author Tags

  1. Black-box attack
  2. query efficiency
  3. transferability
  4. model interpretation

Qualifiers

  • Research-article

Funding Sources

  • National Key R&D Program of China
  • National Natural Science Foundation of China
  • Natural Science Foundation of Chongqing, China
  • Sichuan Science and Technology Program
  • Fundamental Research Funds for the Central Universities
