Abstract
Deep Neural Networks (DNNs) are becoming widespread, particularly in safety-critical areas. One prominent application is image recognition in autonomous driving, where the correct classification of objects, such as traffic signs, is essential for safe driving. Unfortunately, DNNs are prone to backdoors: they base their decisions on attributes of the image that should be irrelevant to its correct classification. Backdoors are integrated into a DNN during training, either maliciously (for instance, a manipulated training process causes any traffic sign carrying a yellow sticker to be recognised as a stop sign) or unintentionally (for instance, biased training data causes any traffic sign in front of a rural background to be recognised as “animal crossing”).
In this paper, we introduce AGNES, a tool to detect backdoors in DNNs for image recognition. We discuss the principal approach on which AGNES is based. Afterwards, we show that our tool outperforms many state-of-the-art methods on multiple relevant case studies.
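To make the maliciously-planted backdoor scenario concrete, the following sketch shows how a BadNets-style poisoning attack of the kind described above could be mounted on training data: a small solid-colour patch (the “yellow sticker”) is stamped into a fraction of the training images, and those images are relabelled to the attacker’s target class. This is an illustrative sketch of the attack setting, not of AGNES itself; the function names and parameters are our own.

```python
import numpy as np

def stamp_trigger(image, patch_size=4, colour=(255, 255, 0)):
    """Stamp a small solid-colour patch (the 'yellow sticker') into the
    bottom-right corner of an H x W x 3 uint8 image."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = colour
    return poisoned

def poison_dataset(images, labels, target_class, rate=0.1, seed=0):
    """Stamp the trigger into a fraction `rate` of the training images and
    relabel those samples to the attacker's target class. A DNN trained on
    this data learns to associate the patch with `target_class`."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = target_class
    return images, labels
```

A model trained on such a poisoned set behaves normally on clean inputs but misclassifies any image carrying the patch, which is precisely the behaviour backdoor-detection tools aim to expose.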
This research was funded in part by the EU under project 864075 CAESAR, the project Audi Verifiable AI, and the BMWi funded KARLI project (grant 19A21031C).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Dhonthi, A., Eiermann, M., Hahn, E.M., Hashemi, V. (2024). AGNES: Abstraction-Guided Framework for Deep Neural Networks Security. In: Dimitrova, R., Lahav, O., Wolff, S. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2024. Lecture Notes in Computer Science, vol 14500. Springer, Cham. https://doi.org/10.1007/978-3-031-50521-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50520-1
Online ISBN: 978-3-031-50521-8
eBook Packages: Computer Science, Computer Science (R0)