Abstract
Tree models have made impressive progress in recent years, yet an important open problem is to understand how these models make predictions, particularly in critical applications such as finance and medicine. Most previous work on this problem has measured the importance of individual features. In this work, we consider the interpretation of feature groups, which better captures the intrinsic structures and correlations among multiple features. We propose the Baseline Group Shapley value (BGShapvalue for short) to quantify the importance of a feature group for tree models. We further develop a polynomial-time algorithm, BGShapTree, to handle the exponential number of terms in the BGShapvalue. The basic idea is to decompose the BGShapvalue into leaf weights and exploit the relationships between features and leaves. Based on this idea, we can greedily search for salient feature groups with large BGShapvalues. Extensive experiments validate the effectiveness of our approach in comparison with state-of-the-art methods for interpreting tree models.
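To make the quantity concrete, the sketch below computes a baseline Shapley value for a feature group by brute force. It assumes the simplest group formulation: the group acts as a single player, every remaining feature is an individual player, and a coalition's value is the model output on a hybrid of the instance and a baseline input. The function names and toy model are our own illustrations, not the paper's definition, and this exponential enumeration is exactly what the paper's polynomial-time BGShapTree avoids.

```python
import itertools
import math

def baseline_group_shapley(model, x, baseline, group):
    """Brute-force baseline Shapley value of a feature group.

    The group is one player; each remaining feature is an individual
    player. For a coalition S, v(S) is the model output on a hybrid
    input taking features in S from x and the rest from the baseline.
    Exponential in the number of features -- for illustration only.
    """
    n_features = len(x)
    group = set(group)
    others = [i for i in range(n_features) if i not in group]
    n = len(others)  # number of players besides the group itself

    def v(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n_features)]
        return model(z)

    total = 0.0
    for k in range(n + 1):
        # Shapley weight of a size-k coalition among n + 1 players
        w = math.factorial(k) * math.factorial(n - k) / math.factorial(n + 1)
        for subset in itertools.combinations(others, k):
            s = set(subset)
            total += w * (v(s | group) - v(s))
    return total

# Toy usage: a hand-written "tree" over three features, scoring 1
# only when the first two features both exceed 0.5.
def toy_tree(z):
    return 1.0 if (z[0] > 0.5 and z[1] > 0.5) else 0.0

print(baseline_group_shapley(toy_tree, x=[1.0, 1.0, 0.0],
                             baseline=[0.0, 0.0, 0.0], group=[0, 1]))  # 1.0
```

Here the group {0, 1} receives the entire attribution of 1.0, which no single-feature score would reveal: neither feature alone moves the toy model's output away from the baseline.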
Acknowledgements
The authors thank the editors and reviewers for their helpful comments and suggestions, and Jia-He Yao for helpful advice. This research was supported by the National Science and Technology Major Project (2021ZD0112802) and the National Natural Science Foundation of China (Grant No. 62376119).
Ethics declarations
Competing interests The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Fan Xu received his BSc degree from Southeast University, China in 2020. He is currently working toward his PhD degree at Nanjing University, China. His research interest is mainly in machine learning.
Zhi-Jian Zhou received his BSc degree from Dalian University of Technology, China in 2021. He is now a graduate student at Nanjing University, China. His research interest is mainly in hypothesis testing.
Jie Ni received his BSc degree from Nanjing University, China in 2021. He is currently a graduate student at Nanjing University, China. His research interests include machine learning and data mining.
Wei Gao received his PhD degree from Nanjing University, China in 2014, and he is currently an associate professor in the School of Artificial Intelligence at Nanjing University, China. His research interests include learning theory. His work has been published in top-tier international journals and conference proceedings such as AIJ, IEEE TPAMI, COLT, ICML, and NeurIPS. He is also a co-author of the book Introduction to the Theory of Machine Learning.
Cite this article
Xu, F., Zhou, ZJ., Ni, J. et al. Interpretation with baseline Shapley value for feature groups on tree models. Front. Comput. Sci. 19, 195316 (2025). https://doi.org/10.1007/s11704-024-40117-2