
Why is Your Trojan NOT Responding? A Quantitative Analysis of Failures in Backdoor Attacks of Neural Networks

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2021)

Abstract

Backdoor attacks offer a new vector to degrade or even subvert deep learning systems and have therefore been extensively studied in the past few years. In reality, however, they are not as robust as expected and oftentimes fail due to many factors, such as data transformations applied to backdoor triggers and defensive measures of the target model. Different backdoor algorithms vary in their resilience to these factors. To evaluate the robustness of backdoor attacks, we conduct a quantitative analysis of backdoor failures and further provide an interpretable way to unveil why these transformations can counteract backdoors. First, we build a uniform evaluation framework in which five backdoor algorithms and three types of transformations are implemented. We randomly select a number of samples from each test dataset and poison them with triggers. These distorted variants are passed to the trojaned models after various data transformations. We measure the differences in predicted results between input samples as the influence of the transformations on backdoor attacks. Moreover, we present a simple approach to interpret the resulting degradation. The results and conclusions of this study shed light on the difficulties of backdoor attacks in the real world, and can facilitate future research on robust backdoor attacks.
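
The evaluation pipeline described above can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' implementation: it assumes a PyTorch trojaned classifier `model`, a placeholder trigger-stamping function `apply_trigger`, a list of test image tensors `test_samples`, and an attacker-chosen `target_label`, and it compares the attack success rate on poisoned samples before and after a data transformation.

```python
# Sketch of the evaluation idea (hypothetical, not the paper's code): poison test
# samples with a trigger, optionally apply a data transformation, and measure how
# often the trojaned model still predicts the attacker's target label.
import torch
import torchvision.transforms as T

def attack_success_rate(model, samples, target_label, apply_trigger, transform=None):
    """Fraction of poisoned samples that the model classifies as the target label."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x in samples:                      # x: a single image tensor (C, H, W)
            x = apply_trigger(x)               # stamp the backdoor trigger
            if transform is not None:
                x = transform(x)               # e.g. flipping, scaling, compression
            pred = model(x.unsqueeze(0)).argmax(dim=1).item()
            hits += int(pred == target_label)
            total += 1
    return hits / max(total, 1)

# Example usage (all names are placeholders):
# base_asr = attack_success_rate(model, test_samples, target_label, apply_trigger)
# flip_asr = attack_success_rate(model, test_samples, target_label, apply_trigger,
#                                transform=T.RandomHorizontalFlip(p=1.0))
# print(f"ASR drop under horizontal flipping: {base_asr - flip_asr:.3f}")
```

The drop in attack success rate under a given transformation is one natural way to quantify how strongly that transformation counteracts a backdoor.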

Acknowledgement

We thank all the anonymous reviewers for their constructive feedback. IIE authors are supported in part by the National Key Research and Development Program (No. 2020AAA0107800), the National Natural Science Foundation of China (Nos. U1836211 and 61902395), the Anhui Department of Science and Technology (No. 202103a05020009), and the Beijing Natural Science Foundation (No. JQ18011).

Author information


Corresponding author

Correspondence to Guozhu Meng.



Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Hu, X., Lan, Y., Gao, R., Meng, G., Chen, K. (2022). Why is Your Trojan NOT Responding? A Quantitative Analysis of Failures in Backdoor Attacks of Neural Networks. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol. 13157. Springer, Cham. https://doi.org/10.1007/978-3-030-95391-1_47

  • DOI: https://doi.org/10.1007/978-3-030-95391-1_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95390-4

  • Online ISBN: 978-3-030-95391-1

  • eBook Packages: Computer Science (R0)
