Abstract
Understanding why a classifier makes a certain prediction is crucial in high-stakes applications. It is also one of the central problems studied in the field of Explainable AI. To accurately explain predictions of a classifier, it is essential to take information about relationships between features into account. Many approaches, however, ignore this information. We address this problem in the context of symbolically encoded Boolean classifiers. Darwiche and Hirth proposed the notion of sufficient reason (also called PI explanation or abductive explanation) to explain predictions of such classifiers. We show that sufficient reasons may be inaccurate and overly verbose, as they ignore information about relationships between features. We propose to represent this information using preferential models, which we use to encode hard as well as soft constraints between features. Preferential models define non-monotonic consequence relations that encode statements such as “birds typically fly” and “penguins typically don’t fly”. We introduce several ways to define reasons in the presence of background knowledge about the feature space, and we analyse these notions by means of general principles that characterise their behaviour.
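To make the notion of sufficient reason concrete, the following minimal Python sketch (not taken from the paper; the classifier, feature names and the brute-force enumeration are illustrative assumptions) computes subset-minimal sufficient reasons (PI explanations) for a toy Boolean classifier, and shows why such reasons can be verbose when relationships between features are ignored.

from itertools import combinations

def classifier(bird, penguin, has_wings):
    # Toy Boolean classifier: predicts "flies" iff the instance is a bird and not a penguin.
    return bird and not penguin

def is_sufficient(features, instance, fixed, predict):
    # A set of fixed feature values is sufficient if every completion of the
    # remaining (free) features yields the same prediction as the original instance.
    target = predict(**instance)
    free = [f for f in features if f not in fixed]
    for bits in range(2 ** len(free)):
        completion = {f: bool((bits >> i) & 1) for i, f in enumerate(free)}
        completion.update({f: instance[f] for f in fixed})
        if predict(**completion) != target:
            return False
    return True

def sufficient_reasons(features, instance, predict):
    # Enumerate subset-minimal sufficient reasons (PI explanations) by brute force.
    reasons = []
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            if any(set(r) <= set(subset) for r in reasons):
                continue  # a smaller sufficient reason is already contained in this one
            if is_sufficient(features, instance, subset, predict):
                reasons.append(subset)
    return reasons

features = ["bird", "penguin", "has_wings"]
instance = {"bird": True, "penguin": False, "has_wings": True}
print(sufficient_reasons(features, instance, classifier))
# prints [('bird', 'penguin')]: without background knowledge, the reason must
# explicitly rule out "penguin", even though penguins are atypical birds.

In this toy setting, background knowledge such as “birds are typically not penguins” would arguably let the explanation shrink to the single feature value bird = true; refining reasons with such hard and soft constraints, expressed via preferential models, is the kind of question the paper addresses.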
References
Choi, A., Shih, A., Goyanka, A., Darwiche, A.: On symbolically encoding the behavior of random forests. CoRR arxiv:2007.01493 (2020)
Darwiche, A., Goldszmidt, M.: On the relation between kappa calculus and probabilistic reasoning. In: UAI, pp. 145–153. Morgan Kaufmann (1994)
Darwiche, A., Hirth, A.: On the reasons behind decisions. In: De Giacomo, G., et al. (eds.) ECAI 2020, vol. 325 of Frontiers in Artificial Intelligence and Applications, pp. 712–720. IOS Press (2020)
Darwiche, A., Pearl, J.: On the logic of iterated belief revision. Artif. Intell. 89(1–2), 1–29 (1996)
Giang, P.H., Shenoy, P.P.: On transformations between probability and Spohnian disbelief functions. In: UAI, pp. 236–244. Morgan Kaufmann (1999)
Goldszmidt, M., Morris, P.H., Pearl, J.: A maximum entropy approach to nonmonotonic reasoning. IEEE Trans. Pattern Anal. Mach. Intell. 15(3), 220–232 (1993)
Gorji, N., Rubin, S.: Sufficient reasons for classifier decisions in the presence of domain constraints. In: AAAI, pp. 5660–5667. AAAI Press (2022)
Ignatiev, A., Narodytska, N., Asher, N., Marques-Silva, J.: From contrastive to abductive explanations and back again. In: Baldoni, M., Bandini, S. (eds.) AIxIA 2020. LNCS (LNAI), vol. 12414, pp. 335–355. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77091-4_21
Ignatiev, A., Narodytska, N., Marques-Silva, J.: Abduction-based explanations for machine learning models. In: AAAI, pp. 1511–1519. AAAI Press (2019)
Kern-Isberner, G., Eichhorn, C.: Structural inference from conditional knowledge bases. Stud. Logica. 102(4), 751–769 (2014)
Kraus, S., Lehmann, D.J., Magidor, M.: Nonmonotonic reasoning, preferential models and cumulative logics. Artif. Intell. 44(1–2), 167–207 (1990)
Lehmann, D.J., Magidor, M.: What does a conditional knowledge base entail? Artif. Intell. 55(1), 1–60 (1992)
Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. CoRR arxiv:1705.07874 (2017)
Marques-Silva, J., Gerspacher, T., Cooper, M.C., Ignatiev, A., Narodytska, N.: Explanations for monotonic classifiers. In: ICML, vol. 139 of Proceedings of Machine Learning Research, pp. 7469–7479. PMLR (2021)
Marques-Silva, J., Ignatiev, A.: Delivering trustworthy AI through formal XAI. In: AAAI, pp. 12342–12350. AAAI Press (2022)
Pearl, J.: System Z: a natural ordering of defaults with tractable applications to nonmonotonic reasoning. In: TARK, pp. 121–135. Morgan Kaufmann (1990)
Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?”: explaining the predictions of any classifier. In: KDD, pp. 1135–1144. ACM (2016)
Shih, A., Choi, A., Darwiche, A.: A symbolic approach to explaining Bayesian network classifiers. In: IJCAI, pp. 5103–5111. ijcai.org (2018)
Slack, D., Hilgard, S., Jia, E., Singh, S., Lakkaraju, H.: Fooling LIME and SHAP: adversarial attacks on post hoc explanation methods. In: Markham, A.N., Powles, J., Walsh, T., Washington, A.L. (eds.) AIES ’20: AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–8 February 2020, pp. 180–186. ACM (2020)
Weydert, E.: System JLZ - rational default reasoning by minimal ranking constructions. J. Appl. Log. 1(3–4), 273–308 (2003)