ABSTRACT
As a critical threat to deep neural networks (DNNs), backdoor attacks can be categorized into two types: source-agnostic backdoor attacks (SABAs) and source-specific backdoor attacks (SSBAs). Compared to traditional SABAs, SSBAs are more advanced in that they are stealthier at bypassing mainstream countermeasures that are effective against SABAs. Nonetheless, existing SSBAs suffer from two major limitations. First, they can hardly achieve a good trade-off between ASR (attack success rate) and FPR (false positive rate). Second, they can be effectively detected by state-of-the-art (SOTA) countermeasures (e.g., SCAn [40]).
To address these limitations, we propose a new class of viable source-specific backdoor attacks, coined CASSOCK. Our key insight is that the trigger designs used when creating poisoned data and cover data in SSBAs play a crucial role in mounting a viable source-specific attack, a factor overlooked by existing SSBAs. With this insight, we focus on trigger transparency and trigger content when crafting triggers for the poisoned dataset, where each sample carries the attacker-targeted label, and the cover dataset, where each sample keeps its ground-truth label. Specifically, we implement CASSOCK_Trans, which designs a trigger with heterogeneous transparency to craft the poisoned and cover datasets, achieving better attack performance than existing SSBAs. We also propose CASSOCK_Cont, which extracts salient features of the attacker-targeted label to generate a trigger, entangling the trigger features with the normal features of that label; this variant is stealthier at bypassing the SOTA defenses. Although CASSOCK_Trans and CASSOCK_Cont are orthogonal, they complement each other, yielding a more powerful attack, CASSOCK_Comp, with further improved attack performance and stealthiness. To demonstrate their viability, we perform a comprehensive evaluation of the three CASSOCK-based attacks on four popular datasets (MNIST, CIFAR10, GTSRB and LFW) against three SOTA defenses (extended Neural Cleanse [45], Februus [8], and SCAn [40]). Compared with a representative SSBA as a baseline (SSBA_Base), CASSOCK-based attacks significantly advance the attack performance, i.e., higher ASR and lower FPR with comparable CDA (clean data accuracy). Moreover, CASSOCK-based attacks effectively bypass the SOTA defenses, whereas SSBA_Base cannot.
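The heterogeneous-transparency idea behind CASSOCK_Trans can be illustrated with a minimal sketch. This is not the authors' implementation: the function and parameter names (stamp_trigger, alpha_poison, alpha_cover) are hypothetical, and the paper's actual pipeline may select transparency values and sample proportions differently.

```python
# Illustrative sketch of source-specific poisoning with heterogeneous
# trigger transparency (in the spirit of CASSOCK_Trans); names and
# transparency values are assumptions, not the authors' code.
import numpy as np

def stamp_trigger(image, trigger, mask, alpha):
    """Blend a trigger patch into an image.

    image, trigger: float arrays in [0, 1] with identical shape.
    mask: binary array marking the trigger region.
    alpha: trigger transparency; 0 leaves the image unchanged,
           1 stamps the trigger fully opaque.
    """
    return image * (1 - mask * alpha) + trigger * (mask * alpha)

def craft_datasets(source_x, nonsource_x, nonsource_y, trigger, mask,
                   target_label, alpha_poison=1.0, alpha_cover=0.5):
    # Poisoned set: source-class samples stamped with a more opaque
    # trigger and relabelled to the attacker-targeted class.
    poisoned = [(stamp_trigger(x, trigger, mask, alpha_poison), target_label)
                for x in source_x]
    # Cover set: non-source samples stamped with a more transparent
    # trigger but keeping their ground-truth labels, so the backdoor
    # fires only for inputs from the source class.
    cover = [(stamp_trigger(x, trigger, mask, alpha_cover), y)
             for x, y in zip(nonsource_x, nonsource_y)]
    return poisoned, cover
```

The intuition is that the cover set teaches the model to ignore the weaker (more transparent) trigger on non-source classes, pushing FPR down, while the stronger trigger on source-class samples keeps ASR high.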
REFERENCES
- Sana Awan, Bo Luo, and Fengjun Li. 2021. CONTRA: Defending against poisoning attacks in federated learning. In European Symposium on Research in Computer Security. Springer, 455–475.
- Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. 2018. Detecting backdoor attacks on deep neural networks by activation clustering. arXiv preprint arXiv:1811.03728 (2018).
- Huili Chen, Cheng Fu, Jishen Zhao, and Farinaz Koushanfar. 2019. DeepInspect: A black-box Trojan detection and mitigation framework for deep neural networks. In International Joint Conference on Artificial Intelligence. 4658–4664.
- Xiaoyi Chen, Ahmed Salem, Michael Backes, Shiqing Ma, and Yang Zhang. 2021. BadNL: Backdoor attacks against NLP models. In International Conference on Machine Learning 2021 Workshop on Adversarial Machine Learning.
- Zhenzhu Chen, Anmin Fu, Robert H. Deng, Ximeng Liu, Yang Yang, and Yinghui Zhang. 2021. Secure and verifiable outsourced data dimension reduction on dynamic data. Information Sciences 573 (2021), 182–193.
- Zhenzhu Chen, Shang Wang, Anmin Fu, Yansong Gao, Shui Yu, and Robert H. Deng. 2022. LinkBreaker: Breaking the backdoor-trigger link in DNNs via neurons consistency check. IEEE Transactions on Information Forensics and Security 17 (2022), 2000–2014. https://doi.org/10.1109/TIFS.2022.3175616
- Edward Chou, Florian Tramer, and Giancarlo Pellegrino. 2020. SentiNet: Detecting localized universal attacks against deep learning systems. In 2020 IEEE Security and Privacy Workshops (SPW). IEEE, 48–54.
- Bao Gia Doan, Ehsan Abbasnejad, and Damith C. Ranasinghe. 2020. Februus: Input purification defense against Trojan attacks on deep neural network systems. In Annual Computer Security Applications Conference. 897–912.
- Khoa Doan, Yingjie Lao, and Ping Li. 2021. Backdoor attack with imperceptible input and latent modification. Advances in Neural Information Processing Systems 34 (2021), 18944–18957.
- Khoa Doan, Yingjie Lao, Weijie Zhao, and Ping Li. 2021. LIRA: Learnable, imperceptible and robust backdoor attacks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 11966–11976.
- Yansong Gao, Bao Gia Doan, Zhi Zhang, Siqi Ma, Jiliang Zhang, Anmin Fu, Surya Nepal, and Hyoungshick Kim. 2020. Backdoor attacks and countermeasures on deep learning: A comprehensive review. arXiv preprint arXiv:2007.10760 (2020).
- Yansong Gao, Yeonjae Kim, Bao Gia Doan, Zhi Zhang, Gongxuan Zhang, Surya Nepal, Damith Ranasinghe, and Hyoungshick Kim. 2021. Design and evaluation of a multi-domain Trojan detection method on deep neural networks. IEEE Transactions on Dependable and Secure Computing (2021).
- Yansong Gao, Change Xu, Derui Wang, Shiping Chen, Damith C. Ranasinghe, and Surya Nepal. 2019. STRIP: A defence against Trojan attacks on deep neural networks. In Annual Computer Security Applications Conference. 113–125.
- Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. 2017. BadNets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733 (2017).
- Raia Hadsell, Sumit Chopra, and Yann LeCun. 2006. Dimensionality reduction by learning an invariant mapping. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 1735–1742.
- Can He, Mingfu Xue, Jian Wang, and Weiqiang Liu. 2020. Embedding backdoors as the facial features: Invisible backdoor attacks against face recognition systems. In Proceedings of the ACM Turing Celebration Conference - China. 231–235.
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770–778.
- Gary B. Huang, Marwan Mattar, Tamara Berg, and Eric Learned-Miller. 2008. Labeled Faces in the Wild: A database for studying face recognition in unconstrained environments. In Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition.
- Satoshi Iizuka, Edgar Simo-Serra, and Hiroshi Ishikawa. 2017. Globally and locally consistent image completion. ACM Transactions on Graphics 36, 4 (2017), 1–14.
- Yujie Ji, Xinyang Zhang, Shouling Ji, Xiapu Luo, and Ting Wang. 2018. Model-reuse attacks on deep learning systems. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 349–363.
- Michael I. Jordan and Tom M. Mitchell. 2015. Machine learning: Trends, perspectives, and prospects. Science 349, 6245 (2015), 255–260.
- Matthew Joslin and Shuang Hao. 2020. Attributing and detecting fake images generated by known GANs. In 2020 IEEE Security and Privacy Workshops (SPW). IEEE, 8–14.
- Alex Krizhevsky and Geoffrey Hinton. 2009. Learning multiple layers of features from tiny images. Computer Science Department, University of Toronto, Tech. Rep 1 (2009).
- Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86, 11 (1998), 2278–2324.
- Haoliang Li, Yufei Wang, Xiaofei Xie, Yang Liu, Shiqi Wang, Renjie Wan, Lap-Pui Chau, and Alex C. Kot. 2020. Light can hack your face! Black-box backdoor attack on face recognition systems. arXiv preprint arXiv:2009.06996 (2020).
- Yiming Li, Baoyuan Wu, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. 2020. Backdoor learning: A survey. arXiv preprint arXiv:2007.08745 (2020).
- Junyu Lin, Lei Xu, Yingqi Liu, and Xiangyu Zhang. 2020. Composite backdoor attack for deep neural network by mixing existing benign features. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security. 113–131.
- Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. 2018. Fine-Pruning: Defending against backdooring attacks on deep neural networks. In International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, 273–294.
- Yingqi Liu, Wen-Chuan Lee, Guanhong Tao, Shiqing Ma, Yousra Aafer, and Xiangyu Zhang. 2019. ABS: Scanning neural networks for back-doors by artificial brain stimulation. In ACM SIGSAC Conference on Computer and Communications Security. 1265–1282.
- Yunfei Liu, Xingjun Ma, James Bailey, and Feng Lu. 2020. Reflection backdoor: A natural backdoor attack on deep neural networks. In European Conference on Computer Vision. Springer, 182–199.
- Hua Ma, Yinshan Li, Yansong Gao, Alsharif Abuadbba, Zhi Zhang, Anmin Fu, Hyoungshick Kim, Said F. Al-Sarawi, Surya Nepal, and Derek Abbott. 2022. Dangerous cloaking: Natural trigger based backdoor attacks on object detectors in the physical world. arXiv preprint arXiv:2201.08619 (2022).
- Tuan Anh Nguyen and Anh Tuan Tran. 2020. WaNet - Imperceptible warping-based backdoor attack. In International Conference on Learning Representations.
- Omkar M. Parkhi, Andrea Vedaldi, and Andrew Zisserman. 2015. Deep face recognition. In British Machine Vision Conference.
- Han Qiu, Yi Zeng, Shangwei Guo, Tianwei Zhang, Meikang Qiu, and Bhavani Thuraisingham. 2021. DeepSweep: An evaluation framework for mitigating DNN backdoor attacks using data augmentation. In ACM Asia Conference on Computer and Communications Security. 363–377.
- Aniruddha Saha, Akshayvarun Subramanya, and Hamed Pirsiavash. 2020. Hidden trigger backdoor attacks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 11957–11965.
- Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision. 618–626.
- Ali Shafahi, W. Ronny Huang, Mahyar Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras, and Tom Goldstein. 2018. Poison Frogs! Targeted clean-label poisoning attacks on neural networks. Advances in Neural Information Processing Systems 31 (2018), 6106–6116.
- Johannes Stallkamp, Marc Schlipsing, Jan Salmen, and Christian Igel. 2012. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Networks 32 (2012), 323–332.
- Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, and Chunfang Liu. 2018. A survey on deep transfer learning. In International Conference on Artificial Neural Networks. 270–279.
- Di Tang, XiaoFeng Wang, Haixu Tang, and Kehuan Zhang. 2021. Demon in the variant: Statistical analysis of DNNs for robust backdoor contamination detection. In 30th USENIX Security Symposium. 1541–1558.
- Guanhong Tao, Yingqi Liu, Guangyu Shen, Qiuling Xu, Shengwei An, Zhuo Zhang, and Xiangyu Zhang. 2022. Model orthogonalization: Class distance hardening in neural networks for better security. In IEEE Symposium on Security and Privacy.
- Brandon Tran, Jerry Li, and Aleksander Madry. 2018. Spectral signatures in backdoor attacks. Advances in Neural Information Processing Systems 31 (2018), 8011–8021.
- Miguel Villarreal-Vasquez and Bharat Bhargava. 2020. ConFoc: Content-focus protection against Trojan attacks on neural networks. arXiv preprint arXiv:2007.00711 (2020).
- Renjie Wan, Boxin Shi, Ling-Yu Duan, Ah-Hwee Tan, and Alex C. Kot. 2017. Benchmarking single-image reflection removal algorithms. In IEEE International Conference on Computer Vision. 3922–3930.
- Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. 2019. Neural Cleanse: Identifying and mitigating backdoor attacks in neural networks. In IEEE Symposium on Security and Privacy. 707–723.
- Ning Wang, Yang Xiao, Yimin Chen, Yang Hu, Wenjing Lou, and Y. Thomas Hou. 2022. FLARE: Defending federated learning against model poisoning attacks via latent space representations. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security. 946–958.
- Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, and Ben Y. Zhao. 2021. Backdoor attacks against deep learning systems in the physical world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6206–6215.
- Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. 2021. Graph backdoor. In USENIX Security Symposium. 1523–1540.
- Shihao Zhao, Xingjun Ma, Xiang Zheng, James Bailey, Jingjing Chen, and Yu-Gang Jiang. 2020. Clean-label backdoor attacks on video recognition models. In IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14443–14452.
- Lei Zhou, Anmin Fu, Guomin Yang, Huaqun Wang, and Yuqing Zhang. 2020. Efficient certificateless multi-copy integrity auditing scheme supporting data dynamics. IEEE Transactions on Dependable and Secure Computing (2020).
- Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, and Qing He. 2020. A comprehensive survey on transfer learning. Proc. IEEE 109 (2020), 43–76.