Abstract
As an emerging type of Neural Network (NN), Transformers are used in many domains ranging from Natural Language Processing to Autonomous Driving. In this paper, we study the robustness problem of Transformers, a key characteristic, as low robustness may cause safety concerns. Specifically, we focus on Sparsemax-based Transformers and reduce finding their maximum robustness to a Mixed Integer Quadratically Constrained Programming (MIQCP) problem. We also design two pre-processing heuristics that can be embedded in the MIQCP encoding and substantially accelerate its solving. We then conduct experiments on a Lane Departure Warning application to compare the robustness of Sparsemax-based Transformers against that of the more conventional Multi-Layer Perceptron (MLP) NNs. To our surprise, Transformers are not necessarily more robust, which prompts careful consideration when selecting NN architectures for safety-critical applications.
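For context on why the encoding is quadratic rather than purely linear: Sparsemax [21] is defined as the Euclidean projection of a score vector onto the probability simplex. A minimal sketch of that definition (not the paper's full MIQCP encoding) is

\[
\operatorname{sparsemax}(z) \;=\; \mathop{\mathrm{arg\,min}}_{p \,\in\, \Delta^{K-1}} \lVert p - z \rVert_2^2,
\qquad
\Delta^{K-1} \;=\; \Bigl\{\, p \in \mathbb{R}^K \;\Big|\; p \ge 0,\; \textstyle\sum_{i=1}^{K} p_i = 1 \,\Bigr\}.
\]

The quadratic objective over a polytope is presumably what the MIQCP formulation captures exactly, in contrast to Softmax, whose exponentials admit no exact encoding with linear or quadratic constraints.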
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 956123 - FOCETA.
Notes
1. Placing LN before MSA and MLP is found to give better network performance than placing it after residual addition [32].
2.
3. This technique also applies to Softmax.
4. The utilization of the High-D dataset in this paper is for knowledge dissemination and scientific publication and is not for commercial use.
5. Verifying the MLP follows steps similar to those in Sect. 4, except that the encoding is relatively simple and can be solved by Mixed Integer Linear Programming (MILP); a sketch of such an encoding is given after these notes.
6. The admissible perturbation region can be derived from input feature values analytically for better physical interpretability. For example, we can set \(\epsilon \) as the normalized value of the ego car's lateral acceleration, considering it a decisive feature for LDW. Perturbations in this context can stem from sensor noise or hardware faults. The binary features are not perturbed.
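For note 5, a minimal sketch of the standard big-M MILP encoding of a single ReLU neuron \(y = \max(0, x)\), in the style of [4, 28], assuming known pre-activation bounds \(l \le x \le u\) with \(l < 0 < u\) (the paper's exact MLP encoding may differ):

\[
y \ge 0, \qquad y \ge x, \qquad y \le u\,\delta, \qquad y \le x - l\,(1-\delta), \qquad \delta \in \{0,1\}.
\]

Setting \(\delta = 0\) forces \(y = 0\) (inactive neuron), while \(\delta = 1\) forces \(y = x\) (active neuron). Every constraint is linear, which is why MILP suffices for the MLP baseline, whereas the Sparsemax layers introduce quadratic terms.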
References
Bhojanapalli, S., Chakrabarti, A., Glasner, D., Li, D., Unterthiner, T., Veit, A.: Understanding robustness of transformers for image classification. In: ICCV (2021)
Bojarski, M., et al.: End to end learning for self-driving cars (2016)
Bonaert, G., Dimitrov, D.I., Baader, M., Vechev, M.: Fast and precise certification of transformers. In: PLDI (2021)
Cheng, C.H., Nührenberg, G., Ruess, H.: Maximum resilience of artificial neural networks. In: ATVA (2017)
Cruise: Cruise Under the Hood 2021, https://youtu.be/uJWN0K26NxQ?t=1342
Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2021)
Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: ATVA (2017)
European Commission: EU AI Act (2021), https://artificialintelligenceact.eu/
Everett, M., Habibi, G., How, J.P.: Robustness analysis of neural networks via efficient partitioning with applications in control systems. IEEE Control Syst. Lett. 5, 2114–2119 (2021)
Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.: AI2: safety and robustness certification of neural networks with abstract interpretation. In: SP (2018)
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2015)
Grossmann, I.E.: Review of nonlinear mixed-integer and disjunctive programming techniques. Optim. Eng. 3, 227–252 (2002)
Gurobi Optimization, LLC: Gurobi optimizer reference manual (2021)
Hu, B.C., Marsso, L., Czarnecki, K., Salay, R., Shen, H., Chechik, M.: If a human can see it, so should your system: Reliability requirements for machine vision components. In: ICSE (2022)
Huang, X., et al.: A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37, 100270 (2020)
Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: CAV (2017)
Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: CAV (2017)
Krajewski, R., Bock, J., Kloeker, L., Eckstein, L.: The highD dataset: a drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: ITSC (2018)
Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks (2017)
Mahajan, V., Katrakazas, C., Antoniou, C.: Prediction of lane-changing maneuvers with automatic labeling and deep learning. TRR J. 2674, 336–347 (2020)
Martins, A.F.T., Astudillo, R.F.: From softmax to sparsemax: A sparse model of attention and multi-label classification. In: ICML (2016)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)
Poretschkin, M., et al.: AI assessment catalog (2023), https://www.iais.fraunhofer.de/en/research/artificial-intelligence/ai-assessment-catalog.html
Shao, R., Shi, Z., Yi, J., Chen, P.Y., Hsieh, C.J.: On the adversarial robustness of vision transformers (2021)
Shi, Z., Zhang, H., Chang, K.W., Huang, M., Hsieh, C.J.: Robustness verification for transformers. In: ICLR (2020)
Su, J., Vargas, D.V., Sakurai, K.: One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. 23, 828–841 (2019)
Tesla: Tesla AI Day 2022, https://www.youtube.com/live/ODSJsviD_SU?feature=share&t=4464
Tjeng, V., Xiao, K., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. In: ICLR (2019)
Vaswani, A., et al.: Attention is all you need. In: NeurIPS (2017)
Wang, S., et al.: Beta-CROWN: efficient bound propagation with per-neuron split constraints for complete and incomplete neural network verification (2021)
Wong, E., Kolter, J.Z.: Provable defenses against adversarial examples via the convex outer adversarial polytope. In: ICML (2018)
Xiong, R., et al.: On layer normalization in the transformer architecture. In: ICLR (2020)