
Identifying Trends in Feature Attributions During Training of Neural Networks

  • Conference paper
  • In: Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2134)


Abstract

This study investigates the evolving dynamics of commonly used feature attribution (FA) values during the training of neural networks. As models transition from a state of high uncertainty to low uncertainty, we show that the features’ significance also changes, which is in line with the general learning theory of deep neural networks. During model training, we compute FA scores through Layer-wise Relevance Propagation (LRP) and Gradient-weighted Class Activation Mapping (Grad-CAM), which are selected for their efficiency and speed of computation. We summarize the attribution scores in terms of the sum of the absolute FA values and their entropy, and we analyze these summary scores in relation to the models’ generalization capabilities. The analysis identifies trends where FA values increase in magnitude while their entropy decreases during training, regardless of model generalization, suggesting that these trends are independent of overfitting. This research offers a unique view of the application of FA methods in explainable artificial intelligence (XAI) and raises intriguing questions about their behavior across varying model architectures and datasets, which may have implications for future work combining XAI and uncertainty estimation in machine learning.
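As a concrete illustration of the two summary statistics described above, the following minimal Python sketch computes, for a single attribution map (as produced, for example, by LRP or Grad-CAM), the sum of the absolute FA values and the Shannon entropy of the absolute values normalized to a probability distribution. The function name and the normalization via absolute values are our assumptions for illustration; the paper's exact definitions are given in its supplementary material.

```python
import numpy as np

def summarize_attributions(attr):
    """Illustrative helper: summarize one feature-attribution map by
    (i) the sum of the absolute FA values and (ii) the Shannon entropy
    of the absolute values normalized to a probability distribution.

    `attr` is any array-like attribution map, e.g. pixel-wise
    relevances from LRP or a class-activation map from Grad-CAM.
    """
    a = np.abs(np.asarray(attr, dtype=np.float64)).ravel()
    total = a.sum()  # magnitude summary: sum of |FA| scores
    if total == 0.0:
        # An all-zero map carries no attribution mass; report zero
        # entropy for this degenerate case by convention.
        return 0.0, 0.0
    p = a / total          # normalize to a probability vector
    nz = p[p > 0.0]        # 0 * log(0) is treated as 0
    entropy = -(nz * np.log(nz)).sum()
    return float(total), float(entropy)

# Tracking both summaries after every epoch would reproduce the kind of
# trend analysis described in the abstract, e.g. (attr_maps hypothetical):
#   sums, entropies = zip(*(summarize_attributions(a) for a in attr_maps))
```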

We gratefully acknowledge funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation): TRR 318/1 2021 - 438445824.

E. Terzieva and M. Muschalik contributed equally.


Notes

  1. The source code of the descriptive analysis conducted in Sect. 3 and Sect. 4 is publicly available at https://github.com/EliTerzieva1995/Identifying-Trends-in-Feature-Attributions-during-Training-of-Neural-Networks. This repository also contains the appendix and further supplementary material of this work.

  2. For a detailed description of the models and datasets used, we refer to the supplementary material (Sections A.1 and A.2).

  3. For details on the negative FA values, we refer to the appendix.


Author information


Correspondence to Elena Terzieva or Maximilian Muschalik.


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Terzieva, E., Muschalik, M., Hofman, P., Hüllermeier, E. (2025). Identifying Trends in Feature Attributions During Training of Neural Networks. In: Meo, R., Silvestri, F. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2023. Communications in Computer and Information Science, vol 2134. Springer, Cham. https://doi.org/10.1007/978-3-031-74627-7_29


  • DOI: https://doi.org/10.1007/978-3-031-74627-7_29


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-74626-0

  • Online ISBN: 978-3-031-74627-7

