Abstract
Although deep neural networks exhibit superior performance across numerous tasks, their application in high-risk domains is limited by a lack of interpretability and trustworthiness. In this paper, we first propose an interaction value calculation method that faithfully represents the interaction utility of each variable in the feature map. Second, we propose an interpretable method for the top-down construction of an interaction hierarchy graph based on interaction utility, which reveals the visual knowledge represented by the filters and elucidates the decision-making process of the network. Extensive experiments were carried out on publicly available datasets and pre-trained models. The results indicate that each node in the graph consistently corresponds to the same object part across different images of the same category. The faithfulness evaluation shows that the filters involved in the graph nodes play a major role in the network's decisions. Furthermore, the quantitative evaluation shows that our method improves over the others by an average of 0.18%, 1.19%, and 2.18% on the EBPG, mIoU, and Bbox metrics, respectively.
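To make the notion of interaction utility concrete, the following Python sketch estimates a Shapley-style pairwise interaction between two feature-map regions by Monte-Carlo sampling over contexts formed from the remaining regions. It illustrates the general idea only and is not the paper's algorithm: the `mask` and `model_score` helpers, the region layout, and the sampling scheme are assumptions made for this example.

```python
import numpy as np

def mask(x, baseline, keep_regions):
    """Keep only the listed regions of x; replace everything else with the baseline."""
    out = baseline.copy()
    for (r0, r1, c0, c1) in keep_regions:
        out[r0:r1, c0:c1] = x[r0:r1, c0:c1]
    return out

def interaction_value(model_score, x, baseline, region_a, region_b, regions, rng, n_samples=16):
    """Monte-Carlo estimate of a Shapley-style interaction utility
    I(a, b) = E_S[ v(S ∪ {a, b}) - v(S ∪ {a}) - v(S ∪ {b}) + v(S) ],
    where v(S) is the model score with only the regions in S present."""
    others = [r for r in regions if r not in (region_a, region_b)]
    total = 0.0
    for _ in range(n_samples):
        s = [r for r in others if rng.random() < 0.5]           # random context S
        v_s   = model_score(mask(x, baseline, s))
        v_sa  = model_score(mask(x, baseline, s + [region_a]))
        v_sb  = model_score(mask(x, baseline, s + [region_b]))
        v_sab = model_score(mask(x, baseline, s + [region_a, region_b]))
        total += v_sab - v_sa - v_sb + v_s
    return total / n_samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((8, 8))                                       # stand-in for a feature map
    baseline = np.zeros_like(x)
    regions = [(i, i + 4, j, j + 4) for i in (0, 4) for j in (0, 4)]  # 2x2 grid of regions
    score = lambda z: float(z.sum())                             # stand-in for a class score
    print(interaction_value(score, x, baseline, regions[0], regions[1], regions, rng))
```

In a full pipeline, the stand-in score would be replaced by a CNN class logit evaluated on baseline-masked inputs, and the resulting interaction values would guide the top-down splitting of regions into the nodes of the hierarchy graph.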
Acknowledgements
This work was supported by the National Natural Science Foundation of China [62372215] and the Jiangsu Science and Technology Project [BE2022781].
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cheng, K., Zhou, H., Wan, H. (2025). Interpreting Convolutional Neural Network Decision via Pixel-Wise Interaction Hierarchy Graph. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15307. Springer, Cham. https://doi.org/10.1007/978-3-031-78183-4_10
DOI: https://doi.org/10.1007/978-3-031-78183-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78182-7
Online ISBN: 978-3-031-78183-4
eBook Packages: Computer Science, Computer Science (R0)