
Improving and Evaluating Sparse Decision-Based Black-Box Attacks and Defenses

  • Conference paper
  • First Online:
Software Fault Prevention, Verification, and Validation (SFPVV 2024)

Abstract

Decision-based black-box attacks are a rising concern in adversarial machine learning, as they allow attackers to manipulate the outputs of machine learning models without access to the model's internal architecture or hyperparameters. Sparse attacks, which aim to minimize the number of perturbed pixels, expose critical vulnerabilities in machine learning models and pose a considerable threat to real-world systems. A current limitation of sparse attacks is that they require thousands of queries to the target model to create imperceptible adversarial examples, which is costly and easily detected. Our study demonstrates the potential of the patch-wise adversarial removal (PAR) algorithm to improve the query efficiency of sparse attacks. To defend against sparse decision-based attackers, we find that adversarial training is an effective countermeasure, strengthened further by median filtering and adversarial detection. We also probe the possibility of enhancing the attacks with our modification of the PAR algorithm, which blends the adversarial example with the original unperturbed input; with this modification, the F1-score of the trained detector drops from 0.97 to 0.89. The study highlights the importance of continued research into understanding the potential severity of sparse attacks and into optimizing the related defenses.
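The two image-level operations mentioned in the abstract can be sketched briefly. The blending modification amounts to a convex combination of the adversarial example and the clean input, and the median-filtering defense replaces each pixel with the median of its neighborhood. This is a minimal illustrative sketch only: the function names, the interpolation weight `alpha`, and the naive single-channel filter are assumptions, not the paper's actual implementation.

```python
import numpy as np

def blend_with_original(x_adv, x_orig, alpha=0.8):
    """Illustrative sketch: pull the adversarial example back toward the
    clean input via a convex combination, weakening detectable artifacts."""
    return alpha * x_adv + (1.0 - alpha) * x_orig

def median_filter(img, k=3):
    """Naive 2D median filter over a single-channel image; borders are
    handled with edge padding. Removes isolated (sparse) perturbations."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out
```

Because a sparse attack perturbs only a small fraction of pixels, each perturbed pixel tends to be an outlier within its neighborhood, which is why a median filter suppresses it while leaving smooth regions essentially unchanged.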



Author information

Corresponding author

Correspondence to Jingyue Li.



Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Jacobsen, J.B., Li, J., Mohus, M.L. (2025). Improving and Evaluating Sparse Decision-Based Black-Box Attacks and Defenses. In: Liu, S. (eds) Software Fault Prevention, Verification, and Validation. SFPVV 2024. Lecture Notes in Computer Science, vol 15393. Springer, Singapore. https://doi.org/10.1007/978-981-96-1621-3_6


  • DOI: https://doi.org/10.1007/978-981-96-1621-3_6

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-1620-6

  • Online ISBN: 978-981-96-1621-3

  • eBook Packages: Computer Science; Computer Science (R0)
