
Neural Tree Decoder for Interpretation of Vision Transformers


Impact Statement:
Explainable artificial intelligence (XAI) is essential to ensuring model reliability. However, recent XAI methods involve a tradeoff between performance and interpretability. The proposed Neural Tree Decoder achieves strong performance and high interpretability at the same time, resolving the tradeoff exhibited by existing XAI models on the fine-grained image classification problem.

Abstract:

In this study, we propose a novel vision transformer neural tree decoder (ViT-NeT) that is both interpretable and highly accurate for fine-grained visual categorization (FGVC). A ViT acts as the backbone and, to overcome its limitations, the output contextual image patches are fed to the proposed NeT. NeT aims to classify fine-grained objects more accurately by exploiting similar inter-class correlations and different intra-class correlations. ViT-NeT can also describe the decision-making process and visually interpret the results through tree structures and prototypes. Because the proposed ViT-NeT is designed not only to improve FGVC classification performance but also to provide a human-friendly interpretation, it effectively resolves the tradeoff between performance and interpretability. We compared the performance of ViT-NeT with other state-of-the-art (SoTA) methods on the widely used FGVC benchmark datasets CUB-200-2011, Stanford Dogs, Stanford Cars, NABirds, and iNaturalist. The proposed method shows promising quantitative and qualitative performance as well as excellent interpretability in comparison with previous SoTA methods.
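
As a rough illustration of the decoding idea described in the abstract, the sketch below implements a soft binary decision tree over ViT patch tokens in PyTorch: each internal node holds a learnable prototype, routing is driven by the maximum patch-prototype similarity, and leaves hold class distributions. This is a minimal sketch under assumed names, dimensions, and tree depth; it is not the authors' published ViT-NeT implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PrototypeTreeDecoder(nn.Module):
    """Soft binary decision tree over ViT patch tokens (illustrative only)."""

    def __init__(self, embed_dim: int, num_classes: int, depth: int = 3):
        super().__init__()
        self.depth = depth
        num_internal = 2 ** depth - 1           # routing nodes, heap-ordered
        num_leaves = 2 ** depth                 # leaves holding class logits
        self.prototypes = nn.Parameter(torch.randn(num_internal, embed_dim))
        self.leaf_logits = nn.Parameter(torch.zeros(num_leaves, num_classes))

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, num_patches, embed_dim) from the ViT backbone.
        # Cosine similarity of every patch to every prototype, max-pooled over
        # patches: the best-matching patch drives the routing decision and can
        # also be visualized to explain that node.
        q = F.normalize(patch_tokens, dim=-1)            # (B, P, D)
        k = F.normalize(self.prototypes, dim=-1)         # (N, D)
        node_scores = (q @ k.t()).max(dim=1).values      # (B, N)
        p_right = torch.sigmoid(node_scores)             # soft routing prob.

        # Accumulate the probability of reaching each leaf, level by level.
        path_prob = patch_tokens.new_ones(patch_tokens.size(0), 1)
        for level in range(self.depth):
            start = 2 ** level - 1
            p = p_right[:, start:start + 2 ** level]     # nodes at this level
            path_prob = torch.stack(
                [path_prob * (1.0 - p), path_prob * p], dim=-1
            ).flatten(1)                                  # (B, 2 ** (level+1))

        # Mixture of leaf class distributions, weighted by leaf probability.
        return path_prob @ F.softmax(self.leaf_logits, dim=-1)


# Usage with dummy ViT outputs: batch of 2, 196 patches, 768-dim embeddings,
# 200 classes (e.g., CUB-200-2011).
decoder = PrototypeTreeDecoder(embed_dim=768, num_classes=200, depth=3)
class_probs = decoder(torch.randn(2, 196, 768))
print(class_probs.shape)  # torch.Size([2, 200])

The prototype-per-node design is what makes the routing inspectable: the patch that maximizes similarity at each node can be shown alongside the prototype to explain why an image was sent down a particular branch.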
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 5, May 2024)
Page(s): 2067 - 2078
Date of Publication: 07 September 2023
Electronic ISSN: 2691-4581

