
AGNES: Abstraction-Guided Framework for Deep Neural Networks Security

Conference paper

Verification, Model Checking, and Abstract Interpretation (VMCAI 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14500)

Abstract

Deep Neural Networks (DNNs) are becoming widespread, particularly in safety-critical areas. One prominent application is image recognition in autonomous driving, where the correct classification of objects, such as traffic signs, is essential for safe driving. Unfortunately, DNNs are prone to backdoors: they concentrate on attributes of the image that should be irrelevant to its correct classification. Backdoors are integrated into a DNN during training, either with malicious intent (such as a manipulated training process, because of which a yellow sticker always leads to a traffic sign being recognised as a stop sign) or unintentionally (such as a rural background leading to any traffic sign being recognised as “animal crossing”, because of biased training data).
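
To make the malicious scenario concrete: in a BadNets-style poisoning attack, an adversary stamps a small patch onto a fraction of the training images and relabels them with a target class, so the trained network learns the patch as a shortcut. The following is a minimal, hypothetical sketch in Python/NumPy; the function names, patch geometry, and poisoning rate are illustrative assumptions, not details taken from the paper:

    import numpy as np

    def stamp_trigger(image, patch_size=4):
        # Plant a solid bright patch (the "yellow sticker") in the
        # bottom-right corner of an HxWxC image with values in [0, 1].
        poisoned = image.copy()
        poisoned[-patch_size:, -patch_size:, :] = 1.0
        return poisoned

    def poison_dataset(images, labels, target_label, rate=0.05, seed=0):
        # Poison a small random fraction of the training set so that a
        # DNN trained on it associates the patch with `target_label`
        # (e.g. the class index of "stop sign").
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        images, labels = images.copy(), labels.copy()
        for i in idx:
            images[i] = stamp_trigger(images[i])
            labels[i] = target_label
        return images, labels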

In this paper, we introduce AGNES, a tool to detect backdoors in DNNs for image recognition. We discuss the principal approach on which AGNES is based, and then show that our tool outperforms many state-of-the-art methods on multiple relevant case studies.
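
The abstract does not reproduce the technical details of AGNES, and the sketch below does not attempt to. It only illustrates a generic activation-analysis signal that backdoor detectors in this area commonly exploit: neurons whose responses shift sharply when a candidate trigger is stamped onto otherwise clean inputs. Everything here (the helper name, the hook mechanics, the top-k heuristic) is an illustrative assumption, not the tool's actual method:

    import torch

    @torch.no_grad()
    def suspicious_neurons(model, layer, clean_batch, triggered_batch, k=10):
        # Record the flattened activations of `layer` for both batches
        # and flag the k neurons whose mean activation shifts the most.
        acts = {}

        def hook(module, inputs, output):
            acts["out"] = output.flatten(1)  # shape: (batch, num_neurons)

        handle = layer.register_forward_hook(hook)
        model(clean_batch)
        clean_mean = acts["out"].mean(dim=0)
        model(triggered_batch)
        trig_mean = acts["out"].mean(dim=0)
        handle.remove()

        shift = (trig_mean - clean_mean).abs()
        return shift.topk(k).indices  # candidate backdoor neurons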

This research was funded in part by the EU under project 864075 CAESAR, the project Audi Verifiable AI, and the BMWi-funded KARLI project (grant 19A21031C).



Author information

Correspondence to Akshay Dhonthi.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Dhonthi, A., Eiermann, M., Hahn, E.M., Hashemi, V. (2024). AGNES: Abstraction-Guided Framework for Deep Neural Networks Security. In: Dimitrova, R., Lahav, O., Wolff, S. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2024. Lecture Notes in Computer Science, vol 14500. Springer, Cham. https://doi.org/10.1007/978-3-031-50521-8_6

  • DOI: https://doi.org/10.1007/978-3-031-50521-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-50520-1

  • Online ISBN: 978-3-031-50521-8

  • eBook Packages: Computer Science, Computer Science (R0)
