
Why is Your Trojan NOT Responding? A Quantitative Analysis of Failures in Backdoor Attacks of Neural Networks

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2021)

Abstract

Backdoor attacks offer a new vector to degrade or even subvert deep learning systems and have therefore been extensively studied in the past few years. In reality, however, they are not as robust as expected and oftentimes fail due to many factors, such as data transformations applied to backdoor triggers and defensive measures of the target model. Different backdoor algorithms vary in their resilience to these factors. To evaluate the robustness of backdoor attacks, we conduct a quantitative analysis of backdoor failures and further provide an interpretable way to unveil why these transformations can counteract backdoors. First, we build a uniform evaluation framework in which five backdoor algorithms and three types of transformations are implemented. We randomly select a number of samples from each test dataset and poison them with triggers. These distorted variants are passed to the trojaned models after various data transformations. We measure the differences in predicted results between input samples as the influence of the transformations on backdoor attacks. Moreover, we present a simple approach to interpret the resulting degradation. The results and conclusions of this study shed light on the difficulties of backdoor attacks in the real world, and can facilitate future research on robust backdoor attacks.
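
The evaluation pipeline described above can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' implementation: it assumes a PyTorch trojaned classifier `model`, a placeholder trigger-stamping function `apply_trigger`, a list of test image tensors `test_samples`, and an attacker-chosen `target_label`, and it compares the attack success rate on poisoned samples before and after a data transformation.

```python
# Sketch of the evaluation idea (hypothetical, not the paper's code): poison test
# samples with a trigger, optionally apply a data transformation, and measure how
# often the trojaned model still predicts the attacker's target label.
import torch
import torchvision.transforms as T

def attack_success_rate(model, samples, target_label, apply_trigger, transform=None):
    """Fraction of poisoned samples that the model classifies as the target label."""
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for x in samples:                      # x: a single image tensor (C, H, W)
            x = apply_trigger(x)               # stamp the backdoor trigger
            if transform is not None:
                x = transform(x)               # e.g. flipping, scaling, compression
            pred = model(x.unsqueeze(0)).argmax(dim=1).item()
            hits += int(pred == target_label)
            total += 1
    return hits / max(total, 1)

# Example usage (all names are placeholders):
# base_asr = attack_success_rate(model, test_samples, target_label, apply_trigger)
# flip_asr = attack_success_rate(model, test_samples, target_label, apply_trigger,
#                                transform=T.RandomHorizontalFlip(p=1.0))
# print(f"ASR drop under horizontal flipping: {base_asr - flip_asr:.3f}")
```

The drop in attack success rate under a given transformation is one natural way to quantify how strongly that transformation counteracts a backdoor.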

Acknowledgement

We thank all the anonymous reviewers for their constructive feedback. IIE authors are supported in part by the National Key Research and Development Program (No. 2020AAA0107800), the National Natural Science Foundation of China (Nos. U1836211 and 61902395), the Anhui Department of Science and Technology (No. 202103a05020009), and the Beijing Natural Science Foundation (No. JQ18011).

Author information


Corresponding author

Correspondence to Guozhu Meng.



Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Hu, X., Lan, Y., Gao, R., Meng, G., Chen, K. (2022). Why is Your Trojan NOT Responding? A Quantitative Analysis of Failures in Backdoor Attacks of Neural Networks. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol. 13157. Springer, Cham. https://doi.org/10.1007/978-3-030-95391-1_47

  • DOI: https://doi.org/10.1007/978-3-030-95391-1_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95390-4

  • Online ISBN: 978-3-030-95391-1

  • eBook Packages: Computer Science (R0)
