Abstract
Deep Neural Networks (DNNs) are becoming widespread, particularly in safety-critical areas. One prominent application is image recognition in autonomous driving, where the correct classification of objects, such as traffic signs, is essential for safe driving. Unfortunately, DNNs are prone to backdoors: they base their decisions on attributes of the image that should be irrelevant to its correct classification. Backdoors are integrated into a DNN during training, either maliciously (for instance, a manipulated training process causes any traffic sign carrying a yellow sticker to be recognised as a stop sign) or unintentionally (for instance, biased training data causes any traffic sign in front of a rural background to be recognised as “animal crossing”).
In this paper, we introduce AGNES, a tool to detect backdoors in DNNs for image recognition. We discuss the principal approach on which AGNES is based. Afterwards, we show that our tool outperforms many state-of-the-art methods on multiple relevant case studies.
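To make the maliciously-planted backdoor scenario concrete, the following sketch shows how a BadNets-style poisoning attack of the kind described above could be mounted on training data: a small solid-colour patch (the “yellow sticker”) is stamped into a fraction of the training images, and those images are relabelled to the attacker’s target class. This is an illustrative sketch of the attack setting, not of AGNES itself; the function names and parameters are our own.

```python
import numpy as np

def stamp_trigger(image, patch_size=4, colour=(255, 255, 0)):
    """Stamp a small solid-colour patch (the 'yellow sticker') into the
    bottom-right corner of an H x W x 3 uint8 image."""
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = colour
    return poisoned

def poison_dataset(images, labels, target_class, rate=0.1, seed=0):
    """Stamp the trigger into a fraction `rate` of the training images and
    relabel those samples to the attacker's target class. A DNN trained on
    this data learns to associate the patch with `target_class`."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = target_class
    return images, labels
```

A model trained on such a poisoned set behaves normally on clean inputs but misclassifies any image carrying the patch, which is precisely the behaviour backdoor-detection tools aim to expose.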
This research was funded in part by the EU under project 864075 CAESAR, the project Audi Verifiable AI, and the BMWi funded KARLI project (grant 19A21031C).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Dhonthi, A., Eiermann, M., Hahn, E.M., Hashemi, V. (2024). AGNES: Abstraction-Guided Framework for Deep Neural Networks Security. In: Dimitrova, R., Lahav, O., Wolff, S. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2024. Lecture Notes in Computer Science, vol 14500. Springer, Cham. https://doi.org/10.1007/978-3-031-50521-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50520-1
Online ISBN: 978-3-031-50521-8
eBook Packages: Computer Science, Computer Science (R0)