Abstract
Transformer-based models dominate natural language processing and are becoming increasingly popular in computer vision. However, the black-box nature of transformers seriously hampers their application in certain fields. Prior work relies on raw attention scores or employs heuristic propagation along the attention graph. In this work, we propose a new way to visualize the model's decisions. The method computes attention scores based on attribution and then propagates them through the layers, accounting for both the attention layers and the multi-head attention mechanism. Our method extracts the salient dependencies in each layer to visualize the prediction results. We benchmark our method on recent visual transformer networks and demonstrate its advantages over existing interpretability methods. Our code is available at: https://github.com/yxheartipp/attr-rollout.
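As a rough illustration of the attribution-then-rollout idea described in the abstract, the sketch below weights each layer's attention map by its gradient with respect to the target class, averages over heads, and propagates the result through the layers. The function name, the gradient-times-attention weighting, and the toy tensors are assumptions made for illustration, not the authors' exact formulation; see the linked repository for the released implementation.

```python
import numpy as np

def attribution_rollout(attentions, gradients):
    """Propagate attribution-weighted attention scores through the layers.

    attentions: list of (heads, tokens, tokens) softmax attention maps, one per layer.
    gradients:  list of arrays of the same shape, d(target logit)/d(attention map).
    Returns a (tokens, tokens) relevance matrix; its first row (the [CLS] token)
    gives a per-patch relevance map for the prediction.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for attn, grad in zip(attentions, gradients):
        # Attribution score for one layer: gradient-weighted attention with the
        # negative contributions clipped, then averaged over the attention heads.
        attr = np.clip(attn * grad, 0.0, None).mean(axis=0)
        # Add the identity to account for the residual (skip) connection and
        # re-normalise rows so each remains a distribution over tokens.
        attr = attr + np.eye(num_tokens)
        attr = attr / attr.sum(axis=-1, keepdims=True)
        # Propagate layer by layer, as in attention rollout.
        rollout = attr @ rollout
    return rollout

# Toy usage with random tensors standing in for a 12-layer, 12-head ViT-Base
# (196 image patches + 1 [CLS] token); real inputs would come from forward and
# backward hooks on the attention modules.
layers, heads, tokens = 12, 12, 197
rng = np.random.default_rng(0)
attns = [rng.random((heads, tokens, tokens)) for _ in range(layers)]
attns = [a / a.sum(axis=-1, keepdims=True) for a in attns]   # rows sum to 1
grads = [rng.standard_normal((heads, tokens, tokens)) for _ in range(layers)]
relevance = attribution_rollout(attns, grads)
patch_map = relevance[0, 1:]            # relevance of each patch w.r.t. [CLS]
print(patch_map.reshape(14, 14).shape)  # reshape to the 14x14 patch grid
```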
Funding
This research was funded in part by the National Natural Science Foundation of China, Grant Number 62172122, and the Fundamental Research Funds for the Central Universities, Jilin University, Grant Number 93K172021K04.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xu, L., Yan, X., Ding, W. et al. Attribution rollout: a new way to interpret visual transformer. J Ambient Intell Human Comput 14, 163–173 (2023). https://doi.org/10.1007/s12652-022-04354-2