Are Transformers More Robust? Towards Exact Robustness Verification for Transformers

  • Conference paper
  • In: Computer Safety, Reliability, and Security (SAFECOMP 2023)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14181)

Abstract

As an emerging type of Neural Network (NN), Transformers are used in many domains ranging from Natural Language Processing to Autonomous Driving. In this paper, we study the robustness of Transformers, a key characteristic, since low robustness may raise safety concerns. Specifically, we focus on Sparsemax-based Transformers and reduce finding their maximum robustness to a Mixed Integer Quadratically Constrained Programming (MIQCP) problem. We also design two pre-processing heuristics that can be embedded in the MIQCP encoding and substantially accelerate its solving. We then conduct experiments on a Lane Departure Warning (LDW) application to compare the robustness of Sparsemax-based Transformers against that of the more conventional Multi-Layer Perceptron (MLP) NNs. To our surprise, Transformers are not necessarily more robust, which calls for careful consideration when selecting NN architectures for safety-critical applications.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 956123 - FOCETA.
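
As background for the abstract: Sparsemax [21] replaces Softmax in the attention layers considered here. It projects the attention logits onto the probability simplex and, being piecewise linear, can return exact zeros, which is one reason an exact mixed-integer encoding is possible. A minimal NumPy sketch of the Sparsemax mapping (our illustration, not the paper's encoding):

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Sparsemax of Martins & Astudillo [21]: the Euclidean projection of z
    onto the probability simplex. Unlike Softmax, it is piecewise linear and
    can assign exactly zero probability to low-scoring entries."""
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # holds exactly for k = 1, ..., k(z)
    k_z = k[support][-1]                     # support size k(z)
    tau = (cumsum[support][-1] - 1) / k_z    # threshold tau(z)
    return np.maximum(z - tau, 0.0)

print(sparsemax(np.array([2.0, 1.0, -1.0])))  # -> [1. 0. 0.], a sparse distribution
```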

Notes

  1. Placing LN before MSA and MLP is found to give better network performance than placing it after residual addition [32].

  2. We write here in vector form (cf. (1)-(3)), as the operations are essentially applied vector-wise.

  3. This technique also applies to Softmax.

  4. The utilization of the highD dataset in this paper is for knowledge dissemination and scientific publication, not for commercial use.

  5. Verifying the MLP follows steps similar to those in Sect. 4, except that the encoding is simpler and can be solved by Mixed Integer Linear Programming (MILP); see the first sketch after these notes.

  6. The admissible perturbation region can be derived analytically from the input feature values for better physical interpretability. For example, we can set \(\epsilon \) as the normalized value of the ego car's lateral acceleration, considering it a decisive feature for LDW. Perturbations in this context can stem from sensor noise or hardware faults. The binary features are not perturbed; see the second sketch after these notes.
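
To illustrate Note 5, here is a minimal sketch of the standard big-M MILP encoding of a single ReLU activation \(y = \max(0, x)\), the building block behind MILP-based verification of piecewise-linear NNs [4, 19, 28]. It is written against the Gurobi API [13] and assumes finite pre-activation bounds \(l \le x \le u\) are known; it is our illustration, not the paper's exact encoding.

```python
import gurobipy as gp
from gurobipy import GRB

def add_relu(model: gp.Model, x: gp.Var, l: float, u: float) -> gp.Var:
    """Big-M encoding of y = max(0, x) with pre-activation bounds l <= x <= u.
    The binary d selects the phase: d = 1 (active, y = x), d = 0 (inactive, y = 0)."""
    y = model.addVar(lb=0.0, ub=max(u, 0.0), name="relu_out")
    d = model.addVar(vtype=GRB.BINARY, name="relu_phase")
    model.addConstr(y >= x)                 # together with lb=0: y >= max(0, x)
    model.addConstr(y <= x - l * (1 - d))   # d = 1  =>  y <= x
    model.addConstr(y <= u * d)             # d = 0  =>  y <= 0
    return y
```

One such encoding per neuron, plus linear constraints for the affine layers and the perturbation region, yields the kind of complete MILP that a solver can optimize exactly; this is roughly the recipe behind, e.g., [28].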
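
Similarly, for Note 6, a small sketch of how the admissible perturbation region might be constructed, assuming an interval (\(\ell_\infty \)-style) region around the normalized input; the function name and mask are our own illustration:

```python
import numpy as np

def admissible_region(x: np.ndarray, eps: float, is_binary: np.ndarray):
    """Per-feature interval bounds for a radius-eps perturbation around the
    normalized input x. Binary features stay fixed, as stated in Note 6;
    `is_binary` is a boolean mask marking them."""
    lower = np.where(is_binary, x, x - eps)   # binary features pinned to x
    upper = np.where(is_binary, x, x + eps)   # continuous features: x +/- eps
    return lower, upper
```

Here \(\epsilon \) can be set analytically, e.g., to the normalized ego-car lateral acceleration mentioned in the note.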

References

  1. Bhojanapalli, S., Chakrabarti, A., Glasner, D., Li, D., Unterthiner, T., Veit, A.: Understanding robustness of transformers for image classification. In: ICCV (2021)

  2. Bojarski, M., et al.: End to end learning for self-driving cars (2016)

  3. Bonaert, G., Dimitrov, D.I., Baader, M., Vechev, M.: Fast and precise certification of transformers. In: PLDI (2021)

  4. Cheng, C.H., Nührenberg, G., Ruess, H.: Maximum resilience of artificial neural networks. In: ATVA (2017)

  5. Cruise: Cruise Under the Hood 2021, https://youtu.be/uJWN0K26NxQ?t=1342

  6. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2021)

  7. Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: ATVA (2017)

  8. European Commission: EU AI Act (2021), https://artificialintelligenceact.eu/

  9. Everett, M., Habibi, G., How, J.P.: Robustness analysis of neural networks via efficient partitioning with applications in control systems. IEEE Control Syst. Lett. 5, 2114–2119 (2021)

  10. Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.: AI2: safety and robustness certification of neural networks with abstract interpretation. In: SP (2018)

  11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2015)

  12. Grossmann, I.E.: Review of nonlinear mixed-integer and disjunctive programming techniques. Optim. Eng. 3, 227–252 (2002)

  13. Gurobi Optimization, LLC: Gurobi optimizer reference manual (2021)

  14. Hu, B.C., Marsso, L., Czarnecki, K., Salay, R., Shen, H., Chechik, M.: If a human can see it, so should your system: Reliability requirements for machine vision components. In: ICSE (2022)

  15. Huang, X., et al.: A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37, 100270 (2020)

  16. Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: CAV (2017)

  17. Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: CAV (2017)

  18. Krajewski, R., Bock, J., Kloeker, L., Eckstein, L.: The highD dataset: a drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: ITSC (2018)

  19. Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks (2017)

  20. Mahajan, V., Katrakazas, C., Antoniou, C.: Prediction of lane-changing maneuvers with automatic labeling and deep learning. TRR J. 2674, 336–347 (2020)

  21. Martins, A.F.T., Astudillo, R.F.: From softmax to sparsemax: A sparse model of attention and multi-label classification. In: ICML (2016)

  22. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)

  23. Poretschkin, M., et al.: AI assessment catalog (2023), https://www.iais.fraunhofer.de/en/research/artificial-intelligence/ai-assessment-catalog.html

  24. Shao, R., Shi, Z., Yi, J., Chen, P.Y., Hsieh, C.J.: On the adversarial robustness of vision transformers. In: ICCV (2021)

  25. Shi, Z., Zhang, H., Chang, K.W., Huang, M., Hsieh, C.J.: Robustness verification for transformers. In: ICLR (2020)

  26. Su, J., Vargas, D.V., Sakurai, K.: One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. 23, 828–841 (2019)

  27. Tesla: Tesla AI Day 2022, https://www.youtube.com/live/ODSJsviD_SU?feature=share&t=4464

  28. Tjeng, V., Xiao, K., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. In: ICLR (2019)

  29. Vaswani, A., et al.: Attention is all you need. In: NeurIPS (2017)

  30. Wang, S., et al.: Beta-CROWN: efficient bound propagation with per-neuron split constraints for complete and incomplete neural network verification (2021)

  31. Wong, E., Kolter, J.Z.: Provable defenses against adversarial examples via the convex outer adversarial polytope. In: ICML (2018)

  32. Xiong, R., et al.: On layer normalization in the transformer architecture. In: ICLR (2020)

Author information

Correspondence to Brian Hsuan-Cheng Liao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liao, B.H.-C., Cheng, C.-H., Esen, H., Knoll, A. (2023). Are Transformers More Robust? Towards Exact Robustness Verification for Transformers. In: Guiochet, J., Tonetta, S., Bitsch, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2023. Lecture Notes in Computer Science, vol 14181. Springer, Cham. https://doi.org/10.1007/978-3-031-40923-3_8

  • DOI: https://doi.org/10.1007/978-3-031-40923-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40922-6

  • Online ISBN: 978-3-031-40923-3

  • eBook Packages: Computer Science, Computer Science (R0)
