Impact Statement:
Explainable artificial intelligence (XAI) is essential to ensuring model reliability. However, recent XAI methods exhibit a tradeoff between performance and interpretability. The proposed neural tree decoder resolves this tradeoff for fine-grained image classification, achieving strong performance and high interpretability at the same time.
Abstract:
In this study, we propose a novel vision transformer neural tree decoder (ViT-NeT) that is interpretable and highly accurate for fine-grained visual categorization (FGVC). A ViT acts as the backbone, and to overcome its limitations, the output contextual image patches are fed to the proposed NeT. NeT aims to more accurately classify fine-grained objects, which exhibit high inter-class similarity and large intra-class variation. ViT-NeT can also describe its decision-making process and visually interpret its results through tree structures and prototypes. Because ViT-NeT is designed not only to improve FGVC classification performance but also to provide human-friendly interpretations, it effectively resolves the tradeoff between performance and interpretability. We compared the performance of ViT-NeT with that of other state-of-the-art (SoTA) methods on the widely used FGVC benchmark datasets CUB-200-2011, Stanford Dogs, Stanford Cars, NABirds, and iNaturalist. The proposed method shows promising quantitative and qualitative performance compared with previous SoTA methods, as well as excellent interpretability.
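
The abstract gives no implementation details, so the following is only a minimal sketch, assuming a PyTorch implementation. Every name here (NeuralTreeDecoder, depth, the cosine-similarity routing, the max-pooling over patches) is a hypothetical illustration of the general idea of a prototype-based soft decision tree over ViT patch tokens, not the authors' actual ViT-NeT code.

    # Illustrative sketch: a soft binary decision tree whose internal nodes
    # route samples by similarity to learned prototypes, and whose leaves
    # hold class logits. Interface and routing scheme are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NeuralTreeDecoder(nn.Module):  # hypothetical name
        def __init__(self, num_classes: int, token_dim: int, depth: int = 3):
            super().__init__()
            self.depth = depth
            n_internal = 2 ** depth - 1      # internal (routing) nodes
            n_leaves = 2 ** depth            # leaf (prediction) nodes
            # One prototype vector per internal node, in patch-token space.
            self.prototypes = nn.Parameter(torch.randn(n_internal, token_dim))
            # Per-leaf class logits.
            self.leaf_logits = nn.Parameter(torch.zeros(n_leaves, num_classes))

        def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
            # patch_tokens: (B, N, D) output patch tokens of a ViT backbone.
            # Cosine similarity of every patch to every prototype, max-pooled
            # over patches, so the most prototype-like patch drives routing.
            sim = torch.einsum("bnd,pd->bnp",
                               F.normalize(patch_tokens, dim=-1),
                               F.normalize(self.prototypes, dim=-1))
            node_score = sim.max(dim=1).values       # (B, n_internal)
            p_right = torch.sigmoid(node_score)      # soft routing probability

            # Accumulate the probability of reaching each node, level by level.
            path_prob = torch.ones(patch_tokens.size(0), 1,
                                   device=patch_tokens.device)
            idx = 0
            for level in range(self.depth):
                n_nodes = 2 ** level
                p = p_right[:, idx:idx + n_nodes]    # (B, n_nodes)
                # Each node splits its mass between (left, right) children.
                path_prob = torch.stack(((1 - p) * path_prob,
                                         p * path_prob), dim=-1)
                path_prob = path_prob.flatten(1)     # (B, 2 * n_nodes)
                idx += n_nodes

            # Expected class logits over leaves: (B, n_leaves) @ (n_leaves, C).
            return path_prob @ self.leaf_logits

    # Usage: in the paper the tokens would come from a ViT backbone; here
    # random tensors stand in purely to show the shapes.
    tokens = torch.randn(8, 196, 768)                # (batch, patches, dim)
    decoder = NeuralTreeDecoder(num_classes=200, token_dim=768, depth=3)
    print(decoder(tokens).shape)                     # torch.Size([8, 200])

Because routing is soft, the per-node prototype similarities and path probabilities can be inspected per image, which is the kind of tree-and-prototype explanation the abstract describes.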
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 5, May 2024)