Abstract:
Deep neural networks are vulnerable to adversarial examples: imperceptible changes to the input that significantly alter the output. In our black-box setting, the attacker can only query the model for its softmax-layer output and has no access to the underlying model. Generating high-quality adversarial examples under a limited query budget and investigating the distribution of adversarial examples are two main challenges in black-box attacks. In this paper, we propose a zeroth-order optimization method for black-box adversarial attacks, termed subspace activation evolution strategy (SA-ES), which captures the most promising search direction for generating more convincing adversarial examples. Moreover, instead of searching for a single reliable adversarial example for a given input, SA-ES finds a distribution of adversarial examples, such that a sample drawn from this distribution is likely adversarial. Comprehensive experiments on various data sets validate that the proposed algorithm efficiently finds perturbation-sensitive regions of an image, stably explores the distribution of adversarial examples with limited queries, and outperforms existing methods. In addition, we apply SA-ES to physical-world black-box attacks, where it effectively generates simulated physical adversarial examples for adversarially trained models.
Published in: IEEE Transactions on Emerging Topics in Computational Intelligence (Volume: 7, Issue: 3, June 2023)
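For orientation, the sketch below illustrates the generic zeroth-order evolution-strategy template that the abstract's query model implies: the attacker sees only softmax probabilities, maintains a Gaussian distribution over perturbations, and updates its mean from sampled losses. This is a minimal illustration under stated assumptions, not the paper's SA-ES algorithm; the function names, loss, and hyperparameters are hypothetical.

```python
# Minimal sketch of a zeroth-order evolution-strategy black-box attack under
# the softmax-only query model described above. All names, hyperparameters,
# and the loss are illustrative assumptions, NOT the paper's SA-ES method.
import numpy as np

def es_black_box_attack(query_probs, x, true_label, eps=0.05, sigma=0.01,
                        lr=0.01, pop_size=50, max_queries=10000):
    """Search a Gaussian distribution N(mu, sigma^2 I) over perturbations so
    that samples drawn from it are likely adversarial.

    query_probs: black-box function mapping an input to softmax probabilities.
    """
    mu = np.zeros_like(x)  # mean of the perturbation distribution
    queries = 0
    while queries + pop_size + 1 <= max_queries:
        # Antithetic sampling: evaluate +/- each noise vector to reduce variance.
        half = np.random.randn(pop_size // 2, *x.shape)
        noise = np.concatenate([half, -half], axis=0)
        losses = np.empty(pop_size)
        for i, z in enumerate(noise):
            delta = np.clip(mu + sigma * z, -eps, eps)  # stay in the L_inf ball
            p = query_probs(np.clip(x + delta, 0.0, 1.0))
            losses[i] = p[true_label]  # minimize the true-class probability
            queries += 1
        # Zeroth-order (NES-style) estimate of the gradient of the expected loss.
        grad = (losses.reshape(-1, *([1] * x.ndim)) * noise).mean(axis=0) / sigma
        mu = np.clip(mu - lr * np.sign(grad), -eps, eps)  # signed descent step
        # Check the distribution mean for an untargeted success.
        x_adv = np.clip(x + mu, 0.0, 1.0)
        queries += 1
        if query_probs(x_adv).argmax() != true_label:
            return x_adv, queries
    return None, queries
```

Because the final perturbation is the mean of a searched distribution rather than a single point estimate, nearby samples drawn from N(mu, sigma^2 I) also tend to be adversarial, which is the distributional property the abstract highlights.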