Impact Statement:
While DNNs are widely applied in face recognition tasks, they are susceptible to adversarial attacks even under black-box scenarios. Black-box adversarial examples are generally crafted against an ensemble of virtual models to improve transferability, where these virtual models are generated by eroding intermediate structures of a base model. However, this erosion mechanism impairs the virtual models' accuracy, which in turn distorts the adversarial loss and weakens the attack's effectiveness. We solve this problem by introducing Gradient Erosion (GE), which keeps the crafted model ensemble diverse without sacrificing accuracy, thereby markedly improving the effectiveness of adversarial attacks against black-box face recognition systems. Our findings indicate that face recognition systems, whether deployed in laboratory environments or in commercial products, urgently need improved robustness.
Abstract:
In recent years, deep neural networks (DNNs) have made significant progress on face recognition (FR). However, DNNs have been found to be vulnerable to adversarial examples, which can lead to severe consequences in real-world applications. This article focuses on improving the transferability of adversarial examples against FR models. We propose gradient eroding (GE), which dynamically erodes back-propagation to make the gradients of the residual blocks more diverse. We also propose a novel black-box adversarial attack named corrasion attack based on GE. Extensive experiments demonstrate that our approach effectively improves the transferability of adversarial attacks against FR models, outperforming state-of-the-art black-box attacks by 29.35% in fooling rate. By adversarially training with the examples our method generates, model robustness can be improved by up to 43.2%. Moreover, the corrasion attack successfully breaks two online FR systems, achieving a fooling rate of up to 89.8%.
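To make the idea of dynamically eroding back-propagation in residual blocks concrete, the following is a minimal PyTorch sketch. The abstract does not specify the erosion rule; randomly down-weighting the residual-branch gradient on each backward pass (controlled by a hypothetical erosion_prob parameter) is an assumption made purely for illustration. Because only gradients are perturbed, the forward pass, and hence the model's accuracy, is left untouched.

```python
# A minimal sketch of gradient eroding (GE) as described in the abstract.
# The concrete erosion rule below is an illustrative assumption, not the
# authors' exact method.
import torch
import torch.nn as nn


class ErodedResidualBlock(nn.Module):
    """Residual block whose residual-branch gradient is randomly eroded."""

    def __init__(self, channels: int, erosion_prob: float = 0.3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.erosion_prob = erosion_prob  # assumed hyperparameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.body(x)
        if out.requires_grad:
            # Dynamically erode back-propagation: with some probability,
            # randomly scale the gradient flowing through the residual
            # branch.  The forward output is unchanged, so the block's
            # accuracy is preserved while its gradients become diverse.
            def erode(grad: torch.Tensor) -> torch.Tensor:
                if torch.rand(1).item() < self.erosion_prob:
                    return grad * torch.empty_like(grad).uniform_(0.0, 1.0)
                return grad

            out.register_hook(erode)
        return torch.relu(x + out)
```

Under this reading, running several backward passes through such blocks yields an ensemble of diverse gradient signals from a single accurate base model, against which transferable adversarial examples can then be crafted.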
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 1, January 2024)