Abstract:
Bit-Trojan attacks based on Bit-Flip Attacks (BFAs) have emerged as severe threats to Deep Neural Networks (DNNs) deployed in safety-critical systems since they can injec...Show MoreMetadata
Abstract:
Bit-Trojan attacks based on Bit-Flip Attacks (BFAs) have emerged as severe threats to Deep Neural Networks (DNNs) deployed in safety-critical systems since they can inject Trojans during the model deployment stage without accessing training supply chains. Existing works are mainly devoted to improving the executability of Bit-Trojan attacks, while seriously ignoring the concerns on evasiveness. In this paper, we propose a highly Evasive Targeted Bit-Trojan (ETBT) with evasiveness improvements from three aspects, i.e., reducing the number of bit-flips (improving executability), smoothing activation distribution, and reducing accuracy fluctuation. Specifically, key neuron extraction is utilized to identify essential neurons from DNNs precisely and decouple the key neurons between different classes, thus improving the evasiveness regarding accuracy fluctuation and executability. Additionally, activation-constrained trigger generation is devised to eliminate the differences between activation distributions of Trojaned and clean models, which enhances evasiveness from the perspective of activation distribution. Ultimately, the strategy of constrained target bits search is designed to reduce bit-flip numbers, directly benefits the evasiveness of ETBT. Benchmark-based experiments are conducted to evaluate the superiority of ETBT. Compared with existing works, ETBT can significantly improve evasiveness-relevant performances with much lower computation overheads, better robustness, and generalizability. Our code is released at https://github.com/bluefier/ETBT.
Published in: IEEE Transactions on Computers ( Volume: 73, Issue: 9, September 2024)