Abstract
Decision-based black-box attacks are a rising concern in adversarial machine learning, as they allow attackers to manipulate the outputs of machine learning models without access to the model's internal architecture or hyperparameters. Sparse attacks, which aim to minimize the number of perturbed pixels, expose critical vulnerabilities in machine learning models and represent a considerable threat to real-world systems. A current limitation of sparse attacks is that they must query the target model thousands of times to create imperceptible adversarial examples, which is costly and easily detected. Our study demonstrates the potential of the patch-wise adversarial removal (PAR) algorithm to improve the query efficiency of sparse attacks. To defend against sparse decision-based attackers, we find that adversarial training is an effective countermeasure, strengthened further by median filtering and adversarial detection. We probe the possibility of enhancing the attacks with our modification of the PAR algorithm, which blends the adversarial example with the original unperturbed input; with this modification, the F1-score of the trained detector drops from 0.97 to 0.89. The study highlights the importance of continued research into understanding the potential severity of sparse attacks and optimizing related defenses.
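The blending modification described above can be sketched as a simple interpolation between the adversarial example and the clean input. This is a minimal illustration, not the authors' implementation: the function name, the `alpha` parameter, and the caller's responsibility to verify that the blended image still flips the model's decision are all assumptions for the sake of the sketch.

```python
import numpy as np

def blend_with_original(x_adv, x_orig, alpha=0.75):
    """Interpolate an adversarial example toward the clean input.

    alpha in (0, 1] controls how much of the adversarial perturbation
    is retained; smaller alpha yields a less perceptible example, but
    the caller must re-query the target model to confirm the blended
    image is still misclassified.
    """
    return alpha * x_adv + (1 - alpha) * x_orig

# Toy example on a 2x2 single-channel "image":
x_orig = np.zeros((2, 2))   # clean input
x_adv = np.ones((2, 2))     # fully perturbed adversarial example
blended = blend_with_original(x_adv, x_orig, alpha=0.75)
# Each pixel is now 0.75 * 1 + 0.25 * 0 = 0.75
```

In a decision-based setting, one would typically decrease `alpha` step by step, keeping the smallest value for which the target model's decision remains adversarial.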
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Jacobsen, J.B., Li, J., Mohus, M.L. (2025). Improving and Evaluating Sparse Decision-Based Black-Box Attacks and Defenses. In: Liu, S. (eds) Software Fault Prevention, Verification, and Validation. SFPVV 2024. Lecture Notes in Computer Science, vol 15393. Springer, Singapore. https://doi.org/10.1007/978-981-96-1621-3_6
DOI: https://doi.org/10.1007/978-981-96-1621-3_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-1620-6
Online ISBN: 978-981-96-1621-3
eBook Packages: Computer Science, Computer Science (R0)