Abstract
Explaining a classification made by tree ensembles is an inherently hard problem that is traditionally solved approximately, without guaranteeing sufficiency or necessity. Abductive explanations were the first attempt to provide concise sufficient information: Given a sample, they consist of a minimal set of features that are relevant for the outcome. Inflated explanations are a refinement that additionally specifies how much at least one feature must be altered in order to allow a change of the prediction. In this paper, we present the first algorithm for generating inflated explanations for gradient boosted trees, today’s de facto standard for tree-based classifiers. Key to our algorithm is a compilation approach based on algebraic decision diagrams. The impact of our approach is illustrated on a number of popular data sets.
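To make the two notions concrete, the following minimal Python sketch contrasts an abductive with an inflated explanation on a hypothetical toy classifier; the classifier, feature names, and intervals are illustrative assumptions only and do not stem from the paper or its algorithm.

```python
# Hypothetical toy classifier: predicts class 1 iff age > 40 and income > 3000.
def classify(x):
    return int(x["age"] > 40 and x["income"] > 3000)

sample = {"age": 55, "income": 5000, "children": 2}
assert classify(sample) == 1

# Abductive explanation (illustrative): a minimal set of features whose current
# values already force the prediction; "children" is irrelevant here and may
# take any value without changing the outcome.
abductive_explanation = {"age", "income"}

# Inflated explanation (illustrative): it additionally attaches to each relevant
# feature a set of values that still guarantees the prediction, i.e. it states
# how far a feature may be altered before the outcome can change.
inflated_explanation = {
    "age": (40, float("inf")),      # any age strictly greater than 40
    "income": (3000, float("inf")), # any income strictly greater than 3000
}
```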
Notes
1. Neither abductive nor inflated explanations are unique in general (cf. [3]).
2. All of our techniques in this paper can also be applied to decision trees learned on categorical features.
3.
4. This is an extension of classic ADDs, which are defined over the domain \(\{0,1\}^n\). By allowing predicates over \(\mathbb F\), we extend the expressiveness of ADDs at the cost of semantic dependencies along paths [13] (a simple, illustrative node representation is sketched after these notes).
5. Based on the class characterization, the value 0 represents any class different from \(c_i\).
6. While there are other approaches to generating abductive explanations, such as [1], which achieves a slight performance improvement over [17], our approach remains faster than [17] by several orders of magnitude. Moreover, our primary contribution lies in the generation of inflated explanations, which neither [17] nor [1] can handle.
7. Note that (1) the learning process can terminate early, which results in fewer than 50 trees, and (2) for 2 classes it is sufficient to learn 50 trees in total.
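As a rough illustration of note 4, the sketch below shows one possible node representation for a decision diagram whose inner nodes carry predicates over feature values and whose terminals carry numeric values (e.g. aggregated class scores, cf. note 5), together with the path-following evaluation such a diagram admits. The representation and names are assumptions for illustration, not the paper's data structure.

```python
# Assumed, simplified ADD-like diagram with predicate-labelled inner nodes
# (here: threshold predicates over numeric features) and numeric terminals.
from __future__ import annotations
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Terminal:
    value: float  # e.g. the aggregated score contributed to a class

@dataclass(frozen=True)
class Node:
    feature: str
    threshold: float               # predicate: sample[feature] <= threshold
    low: Union[Node, Terminal]     # successor when the predicate holds
    high: Union[Node, Terminal]    # successor when it does not

def evaluate(node: Union[Node, Terminal], sample: dict) -> float:
    """Follow the unique path determined by the sample to a terminal value."""
    while isinstance(node, Node):
        node = node.low if sample[node.feature] <= node.threshold else node.high
    return node.value

# Toy diagram: value 1.0 iff age > 40 and income > 3000, otherwise 0.0.
diagram = Node("age", 40,
               low=Terminal(0.0),
               high=Node("income", 3000, low=Terminal(0.0), high=Terminal(1.0)))
print(evaluate(diagram, {"age": 55, "income": 5000}))  # -> 1.0
```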
References
Audemard, G., Lagniez, J., Marquis, P., Szczepanski, N.: Computing abductive explanations for boosted trees. In: Ruiz, F.J.R., Dy, J.G., van de Meent, J. (eds.) International Conference on Artificial Intelligence and Statistics, 25–27 April 2023, Palau de Congressos, Valencia, Spain. Proceedings of Machine Learning Research, vol. 206, pp. 4699–4711. PMLR (2023). https://proceedings.mlr.press/v206/audemard23a.html
Bahar, R.I., et al.: Algebraic decision diagrams and their applications. Form. Methods Syst. Des. 10, 171–206 (1997)
Biradar, G., Izza, Y., Lobo, E., Viswanathan, V., Zick, Y.: Axiomatic aggregations of abductive explanations. In: Wooldridge, M.J., Dy, J.G., Natarajan, S. (eds.) Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, 20–27 February 2024, Vancouver, Canada, pp. 11096–11104. AAAI Press (2024). https://doi.org/10.1609/AAAI.V38I10.28986
Borisov, V., Leemann, T., Seßler, K., Haug, J., Pawelczyk, M., Kasneci, G.: Deep neural networks and tabular data: a survey. IEEE Trans. Neural Netw. Learn. Syst. 1–21 (2022). https://doi.org/10.1109/TNNLS.2022.3229161
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/A:1010933404324
Bryant, R.E.: Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. C-35(8), 677–691 (1986). https://doi.org/10.1109/TC.1986.1676819
Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Krishnapuram, B., Shah, M., Smola, A.J., Aggarwal, C.C., Shen, D., Rastogi, R. (eds.) Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016, pp. 785–794. ACM (2016). https://doi.org/10.1145/2939672.2939785
Darwiche, A., Hirth, A.: On the reasons behind decisions. In: Giacomo, G.D., et al. (eds.) ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August–8 September 2020, Santiago de Compostela, Spain - Including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020). Frontiers in Artificial Intelligence and Applications, vol. 325, pp. 712–720. IOS Press (2020). https://doi.org/10.3233/FAIA200158
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 1189–1232 (2001)
Gossen, F., Margaria, T., Steffen, B.: Towards explainability in machine learning: the formal methods way. IT Prof. 22(4), 8–12 (2020). https://doi.org/10.1109/MITP.2020.3005640
Gossen, F., Margaria, T., Steffen, B.: Formal methods boost experimental performance for explainable AI. IT Prof. 23(6), 8–12 (2021). https://doi.org/10.1109/MITP.2021.3123495
Gossen, F., Murtovi, A., Zweihoff, P., Steffen, B.: Add-lib: decision diagrams in practice. CoRR abs/1912.11308 (2019). http://arxiv.org/abs/1912.11308
Gossen, F., Steffen, B.: Algebraic aggregation of random forests: towards explainability and rapid evaluation. Int. J. Softw. Tools Technol. Transf. 1–19 (2021)
Grinsztajn, L., Oyallon, E., Varoquaux, G.: Why do tree-based models still outperform deep learning on typical tabular data? In: NeurIPS (2022). http://papers.nips.cc/paper_files/paper/2022/hash/0378c7692da36807bdec87ab043cdadc-Abstract-Datasets_and_Benchmarks.html
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. 51(5), 93:1–93:42 (2019). https://doi.org/10.1145/3236009
Huang, X., Izza, Y., Ignatiev, A., Marques-Silva, J.: On efficiently explaining graph-based classifiers. In: Bienvenu, M., Lakemeyer, G., Erdem, E. (eds.) Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning, KR 2021, Online event, 3–12 November 2021, pp. 356–367 (2021). https://doi.org/10.24963/KR.2021/34
Ignatiev, A., Izza, Y., Stuckey, P.J., Marques-Silva, J.: Using MaxSAT for efficient explanations of tree ensembles. In: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022, Virtual Event, 22 February–1 March 2022, pp. 3776–3785. AAAI Press (2022). https://doi.org/10.1609/AAAI.V36I4.20292
Ignatiev, A., Narodytska, N., Marques-Silva, J.: Abduction-based explanations for machine learning models. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, 27 January–1 February 2019, pp. 1511–1519. AAAI Press (2019). https://doi.org/10.1609/AAAI.V33I01.33011511
Ignatiev, A., Narodytska, N., Marques-Silva, J.: On validating, repairing and refining heuristic ML explanations. CoRR abs/1907.02509 (2019). http://arxiv.org/abs/1907.02509
Izza, Y., Ignatiev, A., Marques-Silva, J.: On tackling explanation redundancy in decision trees. J. Artif. Intell. Res. 75, 261–321 (2022). https://doi.org/10.1613/JAIR.1.13575
Izza, Y., Ignatiev, A., Stuckey, P.J., Marques-Silva, J.: Delivering inflated explanations. CoRR abs/2306.15272 (2023). https://doi.org/10.48550/ARXIV.2306.15272
Izza, Y., Marques-Silva, J.: On explaining random forests with SAT. In: Zhou, Z. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021, Virtual Event / Montreal, Canada, 19–27 August 2021, pp. 2584–2591. ijcai.org (2021). https://doi.org/10.24963/IJCAI.2021/356
Jörges, S., Margaria, T., Steffen, B.: Genesys: service-oriented construction of property conform code generators. Innov. Syst. Softw. Eng. 4(4), 361–384 (2008). https://doi.org/10.1007/S11334-008-0071-2
Lundberg, S.M., Lee, S.: A unified approach to interpreting model predictions. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4–9 December 2017, Long Beach, CA, USA, pp. 4765–4774 (2017). https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
Margaria, T.: Fully automatic verification and error detection for parameterized iterative sequential circuits. In: Margaria, T., Steffen, B. (eds.) TACAS 1996. LNCS, vol. 1055, pp. 258–277. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-61042-1_49
Margaria, T., Meyer, D., Kubczak, C., Isberner, M., Steffen, B.: Synthesizing semantic web service compositions with jMosel and Golog. In: Bernstein, A., et al. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 392–407. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04930-9_25
Murtovi, A., Bainczyk, A., Nolte, G., Schlüter, M., Steffen, B.: Forest GUMP: a tool for verification and explanation. Int. J. Softw. Tools Technol. Transf. 25(3), 287–299 (2023). https://doi.org/10.1007/S10009-023-00702-5
Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?”: explaining the predictions of any classifier. In: Krishnapuram, B., Shah, M., Smola, A.J., Aggarwal, C.C., Shen, D., Rastogi, R. (eds.) Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016, pp. 1135–1144. ACM (2016). https://doi.org/10.1145/2939672.2939778
Shih, A., Choi, A., Darwiche, A.: A symbolic approach to explaining Bayesian network classifiers. In: Lang, J. (ed.) Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, 13–19 July 2018, Stockholm, Sweden, pp. 5103–5111. ijcai.org (2018). https://doi.org/10.24963/IJCAI.2018/708
Shwartz-Ziv, R., Armon, A.: Tabular data: deep learning is not all you need. Inf. Fusion 81, 84–90 (2022). https://doi.org/10.1016/J.INFFUS.2021.11.011
Steffen, B., Gossen, F., Naujokat, S., Margaria, T.: Language-driven engineering: from general-purpose to purpose-specific languages. In: Steffen, B., Woeginger, G. (eds.) Computing and Software Science. LNCS, vol. 10000, pp. 311–344. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-91908-9_17
Steffen, B., Margaria, T., Nagel, R., Jörges, S., Kubczak, C.: Model-driven development with the jABC. In: Bin, E., Ziv, A., Ur, S. (eds.) HVC 2006. LNCS, vol. 4383, pp. 92–108. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-70889-6_7
Topnik, C., Wilhelm, E., Margaria, T., Steffen, B.: jMosel: a stand-alone tool and jABC plugin for M2L(Str). In: Valmari, A. (ed.) SPIN 2006. LNCS, vol. 3925, pp. 293–298. Springer, Heidelberg (2006). https://doi.org/10.1007/11691617_18
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Murtovi, A., Schlüter, M., Steffen, B. (2025). Computing Inflated Explanations for Boosted Trees: A Compilation-Based Approach. In: Hinchey, M., Steffen, B. (eds) The Combined Power of Research, Education, and Dissemination. Lecture Notes in Computer Science, vol 15240. Springer, Cham. https://doi.org/10.1007/978-3-031-73887-6_14
DOI: https://doi.org/10.1007/978-3-031-73887-6_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73886-9
Online ISBN: 978-3-031-73887-6
eBook Packages: Computer Science, Computer Science (R0)