Abstract
With the increasing interest in explainable attribution for deep neural networks, it is important to consider not only the importance of individual inputs but also that of the model parameters themselves. Existing methods, such as Neuron Integrated Gradients [18] and Conductance [6], attempt model attribution by applying attribution methods such as Integrated Gradients to the inputs of each model parameter. While these methods appear to map attributions to individual parameters, they are in fact aggregated feature attributions that ignore the parameter space entirely and inherit the underlying limitations of Integrated Gradients. In this work, we compute parameter attributions by leveraging the recent family of measures proposed by Generalized Integrated Attributions, instead computing integrals over the product space of inputs and parameters. Working in the product space allows us to explain individual neurons from varying perspectives and to interpret them with the same intuition as inputs. To the best of our knowledge, ours is the first method that uses the gradient landscape of the parameter space to explain each individual weight and bias. We confirm the utility of our parameter attributions by computing exploratory statistics on a wide variety of image classification datasets and by performing pruning analyses on a standard architecture; these demonstrate that our attribution measures identify both important and unimportant neurons in a convolutional neural network.
Supported by the National Science Foundation under Grant No. 2134237. Code: https://github.com/UTSA-VAIL/Explaining-Model-Parameters-Using-the-Product-Space.
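The abstract's central construction, an Integrated Gradients-style path integral taken over the joint space of inputs and parameters rather than over inputs alone, can be illustrated with a short sketch. The PyTorch code below is one minimal reading of that idea, not the authors' exact measure (the Generalized Integrated Attributions family admits many paths and baselines); the zero baselines for both input and parameters, the shared interpolation coefficient, and the name product_space_attributions are assumptions made for illustration.

import torch
from torch.func import functional_call, grad

def product_space_attributions(model, x, target, steps=32):
    # Illustrative sketch (not the paper's exact measure): attribute the
    # target logit to every weight and bias by integrating its gradient
    # along a straight-line path through the product space of inputs and
    # parameters. x is a batched input (e.g. shape [1, C, H, W]); target
    # is a class index. Returns a dict of per-parameter attributions.
    theta = {name: p.detach() for name, p in model.named_parameters()}

    def target_logit(params, inputs):
        return functional_call(model, params, (inputs,))[0, target]

    grad_wrt_params = grad(target_logit)  # gradient w.r.t. the param dict
    accum = {name: torch.zeros_like(p) for name, p in theta.items()}
    for i in range(1, steps + 1):
        alpha = i / steps
        # Shared coefficient: input and parameters move together from
        # zero baselines (an assumption) toward their actual values.
        params_alpha = {name: alpha * p for name, p in theta.items()}
        grads = grad_wrt_params(params_alpha, alpha * x)
        for name in accum:
            accum[name] += grads[name] / steps
    # Riemann approximation of the path integral, scaled by (theta - 0).
    return {name: theta[name] * accum[name] for name in theta}

Under this sketch, ranking individual weights, or whole neurons by aggregating over their incoming weights, by attribution magnitude yields a natural pruning criterion, in the spirit of the pruning analyses the abstract describes.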
References
Alqahtani, A., Xie, X., Essa, E., Jones, M.W.: Neuron-based network pruning based on majority voting. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 3090–3097 (2021). https://api.semanticscholar.org/CorpusID:233877899
Ancona, M., Ceolini, E., Öztireli, C., Gross, M.: Towards better understanding of gradient-based attribution methods for deep neural networks. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=Sy21R9JAW
Barkan, O., Elisha, Y., Asher, Y., Eshel, A., Koenigstein, N.: Visual explanations via iterated integrated attributions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2073–2084 (October 2023)
Chundawat, V.S., Tarun, A.K., Mandal, M., Kankanhalli, M.: Zero-shot machine unlearning. IEEE Trans. Inf. Forensics Secur. 18, 2345–2354 (2023). https://doi.org/10.1109/TIFS.2023.3265506
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Dhamdhere, K., Sundararajan, M., Yan, Q.: How important is a neuron? In: International Conference on Learning Representations (2019). https://openreview.net/forum?id=SylKoo0cKm
Erion, G., Janizek, J., Sturmfels, P., Lundberg, S., Lee, S.I.: Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nat. Mach. Intell. 3, 1–12 (2021). https://doi.org/10.1038/s42256-021-00343-w
Fel, T., et al.: CRAFT: concept recursive activation factorization for explainability. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2711–2721 (2023)
Ghorbani, A., Zou, J.Y.: Neuron Shapley: discovering the responsible neurons. arXiv preprint arXiv:2002.09815 (2020). https://api.semanticscholar.org/CorpusID:211258568
Hase, P., Xie, H., Bansal, M.: The out-of-distribution problem in explainability and search methods for feature importance explanations. Adv. Neural Inf. Process. Syst. 34 (2021)
Krizhevsky, A.: Learning multiple layers of features from tiny images (2009)
LeCun, Y., Cortes, C., Burges, C.: MNIST handwritten digit database. ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist (2010)
Leino, K., Li, L., Sen, S., Datta, A., Fredrikson, M.: Influence-directed explanations for deep convolutional networks. In: 2018 IEEE International Test Conference (ITC), pp. 1–8 (2018)
Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 4768–4777. NIPS’17, Curran Associates Inc., Red Hook, NY, USA (2017)
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NeurIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)
Sajjad, H., Durrani, N., Dalvi, F.: Neuron-level interpretation of deep NLP models: a survey. Trans. Assoc. Comput. Linguist. 10, 1285–1303 (2022). https://api.semanticscholar.org/CorpusID:237353268
Shrikumar, A., Greenside, P., Kundaje, A.: Learning important features through propagating activation differences. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70, pp. 3145–3153. ICML’17, JMLR.org (2017)
Shrikumar, A., Su, J., Kundaje, A.: Computationally efficient measures of internal neuron importance. arXiv preprint arXiv:1807.09946 (2018)
Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. In: Proceedings of the International Conference on Learning Representations (ICLR). ICLR (2014)
Srinivas, S., Fleuret, F.: Full-gradient representation for neural network visualization. Adv. Neural Inf. Process. Syst. 32 (2019)
Anonymous: Generalized integrated gradients. Under review, supplemental materials (2023)
Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: International Conference on Machine Learning, pp. 3319–3328. PMLR (2017)
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)
Zhang, J., Bargal, S.A., Lin, Z., Brandt, J., Shen, X., Sclaroff, S.: Top-down neural attention by excitation backprop. Int. J. Comput. Vision 126(10), 1084–1102 (2018)
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant No. 2134237. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Payne, E., Patrick, D., Fernandez, A.S. (2025). Explaining Model Parameters Using the Product Space. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15310. Springer, Cham. https://doi.org/10.1007/978-3-031-78192-6_1
DOI: https://doi.org/10.1007/978-3-031-78192-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78191-9
Online ISBN: 978-3-031-78192-6
eBook Packages: Computer Science, Computer Science (R0)