Are Transformers More Robust? Towards Exact Robustness Verification for Transformers

  • Conference paper
  • In: Computer Safety, Reliability, and Security (SAFECOMP 2023)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14181)

Abstract

As an emerging type of Neural Network (NN), Transformers are used in many domains ranging from Natural Language Processing to Autonomous Driving. In this paper, we study the robustness of Transformers, a key characteristic, since low robustness may raise safety concerns. Specifically, we focus on Sparsemax-based Transformers and reduce finding their maximum robustness to a Mixed Integer Quadratically Constrained Programming (MIQCP) problem. We also design two pre-processing heuristics that can be embedded in the MIQCP encoding and substantially accelerate its solving. We then conduct experiments on a Lane Departure Warning (LDW) application to compare the robustness of Sparsemax-based Transformers against that of the more conventional Multi-Layer Perceptron (MLP) NNs. To our surprise, Transformers are not necessarily more robust, which calls for careful consideration when selecting NN architectures for safety-critical applications.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 956123 - FOCETA.
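
As background for the abstract: Sparsemax [21] replaces Softmax in the attention layers considered here. It projects the attention logits onto the probability simplex and, being piecewise linear, can return exact zeros, which is one reason an exact mixed-integer encoding is possible. A minimal NumPy sketch of the Sparsemax mapping (our illustration, not the paper's encoding):

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Sparsemax of Martins & Astudillo [21]: the Euclidean projection of z
    onto the probability simplex. Unlike Softmax, it is piecewise linear and
    can assign exactly zero probability to low-scoring entries."""
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # holds exactly for k = 1, ..., k(z)
    k_z = k[support][-1]                     # support size k(z)
    tau = (cumsum[support][-1] - 1) / k_z    # threshold tau(z)
    return np.maximum(z - tau, 0.0)

print(sparsemax(np.array([2.0, 1.0, -1.0])))  # -> [1. 0. 0.], a sparse distribution
```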

Notes

  1. Placing LN before MSA and MLP is found to give better network performance than placing it after residual addition [32].

  2. We write here in vector form (cf. (1)-(3)), as the operations are essentially applied vector-wise.

  3. This technique also applies to Softmax.

  4. The utilization of the highD dataset in this paper is for knowledge dissemination and scientific publication, not for commercial use.

  5. Verifying the MLP follows steps similar to those in Sect. 4, except that the encoding is simpler and can be solved by Mixed Integer Linear Programming (MILP); see the first sketch after these notes.

  6. The admissible perturbation region can be derived analytically from the input feature values for better physical interpretability. For example, we can set \(\epsilon \) as the normalized value of the ego car's lateral acceleration, considering it a decisive feature for LDW. Perturbations in this context can stem from sensor noise or hardware faults. The binary features are not perturbed; see the second sketch after these notes.
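
To illustrate Note 5, here is a minimal sketch of the standard big-M MILP encoding of a single ReLU activation \(y = \max(0, x)\), the building block behind MILP-based verification of piecewise-linear NNs [4, 19, 28]. It is written against the Gurobi API [13] and assumes finite pre-activation bounds \(l \le x \le u\) are known; it is our illustration, not the paper's exact encoding.

```python
import gurobipy as gp
from gurobipy import GRB

def add_relu(model: gp.Model, x: gp.Var, l: float, u: float) -> gp.Var:
    """Big-M encoding of y = max(0, x) with pre-activation bounds l <= x <= u.
    The binary d selects the phase: d = 1 (active, y = x), d = 0 (inactive, y = 0)."""
    y = model.addVar(lb=0.0, ub=max(u, 0.0), name="relu_out")
    d = model.addVar(vtype=GRB.BINARY, name="relu_phase")
    model.addConstr(y >= x)                 # together with lb=0: y >= max(0, x)
    model.addConstr(y <= x - l * (1 - d))   # d = 1  =>  y <= x
    model.addConstr(y <= u * d)             # d = 0  =>  y <= 0
    return y
```

One such encoding per neuron, plus linear constraints for the affine layers and the perturbation region, yields the kind of complete MILP that a solver can optimize exactly; this is roughly the recipe behind, e.g., [28].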
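
Similarly, for Note 6, a small sketch of how the admissible perturbation region might be constructed, assuming an interval (\(\ell_\infty \)-style) region around the normalized input; the function name and mask are our own illustration:

```python
import numpy as np

def admissible_region(x: np.ndarray, eps: float, is_binary: np.ndarray):
    """Per-feature interval bounds for a radius-eps perturbation around the
    normalized input x. Binary features stay fixed, as stated in Note 6;
    `is_binary` is a boolean mask marking them."""
    lower = np.where(is_binary, x, x - eps)   # binary features pinned to x
    upper = np.where(is_binary, x, x + eps)   # continuous features: x +/- eps
    return lower, upper
```

Here \(\epsilon \) can be set analytically, e.g., to the normalized ego-car lateral acceleration mentioned in the note.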

References

  1. Bhojanapalli, S., Chakrabarti, A., Glasner, D., Li, D., Unterthiner, T., Veit, A.: Understanding robustness of transformers for image classification. In: ICCV (2021)

  2. Bojarski, M., et al.: End to end learning for self-driving cars (2016)

  3. Bonaert, G., Dimitrov, D.I., Baader, M., Vechev, M.: Fast and precise certification of transformers. In: PLDI (2021)

  4. Cheng, C.H., Nührenberg, G., Ruess, H.: Maximum resilience of artificial neural networks. In: ATVA (2017)

  5. Cruise: Cruise Under the Hood 2021, https://youtu.be/uJWN0K26NxQ?t=1342

  6. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR (2021)

  7. Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: ATVA (2017)

  8. European Commission: EU AI Act (2021), https://artificialintelligenceact.eu/

  9. Everett, M., Habibi, G., How, J.P.: Robustness analysis of neural networks via efficient partitioning with applications in control systems. IEEE Control Syst. Lett. 5, 2114–2119 (2021)

  10. Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.: AI2: safety and robustness certification of neural networks with abstract interpretation. In: SP (2018)

  11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2015)

  12. Grossmann, I.E.: Review of nonlinear mixed-integer and disjunctive programming techniques. Optim. Eng. 3, 227–252 (2002)

  13. Gurobi Optimization, LLC: Gurobi optimizer reference manual (2021)

  14. Hu, B.C., Marsso, L., Czarnecki, K., Salay, R., Shen, H., Chechik, M.: If a human can see it, so should your system: Reliability requirements for machine vision components. In: ICSE (2022)

  15. Huang, X., et al.: A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37, 100270 (2020)

  16. Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: CAV (2017)

  17. Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: CAV (2017)

  18. Krajewski, R., Bock, J., Kloeker, L., Eckstein, L.: The highD dataset: a drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: ITSC (2018)

  19. Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks (2017)

  20. Mahajan, V., Katrakazas, C., Antoniou, C.: Prediction of lane-changing maneuvers with automatic labeling and deep learning. TRR J. 2674, 336–347 (2020)

  21. Martins, A.F.T., Astudillo, R.F.: From softmax to sparsemax: A sparse model of attention and multi-label classification. In: ICML (2016)

  22. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)

  23. Poretschkin, M., et al.: AI assessment catalog (2023), https://www.iais.fraunhofer.de/en/research/artificial-intelligence/ai-assessment-catalog.html

  24. Shao, R., Shi, Z., Yi, J., Chen, P.Y., Hsieh, C.J.: On the adversarial robustness of vision transformers. In: ICCV (2021)

  25. Shi, Z., Zhang, H., Chang, K.W., Huang, M., Hsieh, C.J.: Robustness verification for transformers. In: ICLR (2020)

  26. Su, J., Vargas, D.V., Sakurai, K.: One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. 23, 828–841 (2019)

  27. Tesla: Tesla AI Day 2022, https://www.youtube.com/live/ODSJsviD_SU?feature=share&t=4464

  28. Tjeng, V., Xiao, K., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. In: ICLR (2019)

  29. Vaswani, A., et al.: Attention is all you need. In: NeurIPS (2017)

  30. Wang, S., et al.: Beta-CROWN: efficient bound propagation with per-neuron split constraints for complete and incomplete neural network verification (2021)

  31. Wong, E., Kolter, J.Z.: Provable defenses against adversarial examples via the convex outer adversarial polytope. In: ICML (2018)

  32. Xiong, R., et al.: On layer normalization in the transformer architecture. In: ICLR (2020)

Author information

Correspondence to Brian Hsuan-Cheng Liao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liao, B.H.-C., Cheng, C.-H., Esen, H., Knoll, A. (2023). Are Transformers More Robust? Towards Exact Robustness Verification for Transformers. In: Guiochet, J., Tonetta, S., Bitsch, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2023. Lecture Notes in Computer Science, vol 14181. Springer, Cham. https://doi.org/10.1007/978-3-031-40923-3_8

  • DOI: https://doi.org/10.1007/978-3-031-40923-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40922-6

  • Online ISBN: 978-3-031-40923-3

  • eBook Packages: Computer Science, Computer Science (R0)
