
Explainability Metrics and Properties for Counterfactual Explanation Methods

  • Conference paper
  • In: Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS 2022)

Abstract

The increasing application of Explainable AI (XAI) methods to enhance the transparency and trustworthiness of AI systems underscores the need to quantitatively assess and analyze the theoretical and behavioral characteristics of the explanations these methods generate. A fair number of metrics and properties exist; however, these metrics tend to be method-specific, complex, and at times hard to interpret. This work focuses on (i) identifying the metrics and properties applicable to selected post-hoc counterfactual explanation methods (a mechanism for generating explanations), (ii) assessing the applicability of the identified metrics and properties for comparing counterfactual examples across explanation methods, and (iii) analyzing the properties of those counterfactual explanation methods. A pipeline is designed to implement the proof-of-concept tool, comprising the following steps: selecting a data set, training a suitable classifier, deploying counterfactual generation method(s), and implementing the defined XAI metrics to infer which properties an explanation method satisfies. The outcome of the experiments is that the desirable properties of counterfactual explanations are satisfied to varying degrees, as measured by the different metrics. Certain inconsistencies were identified in the counterfactual explanation methods; for example, some generated counterfactual instances were not pushed to the desired class, defeating one of the main purposes of obtaining counterfactual explanations. Several further properties are discussed to analyze the counterfactual explanation methods.

Supported by Ericsson Research.
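
As a rough, hypothetical illustration of the pipeline described in the abstract (not the authors' implementation), the sketch below trains a scikit-learn classifier on the Iris data set, stubs in a naive counterfactual generator, and computes a validity metric: the fraction of counterfactual instances actually classified as the desired class, which is the failure mode highlighted above. The generator and the validity function are assumptions introduced here for illustration only.

    # Minimal sketch of the evaluation pipeline (assumed, illustrative only):
    # select a data set, train a classifier, generate counterfactuals, and
    # measure "validity", i.e. the fraction of counterfactuals reaching the
    # desired class. A real pipeline would use a dedicated post-hoc
    # counterfactual method in place of the naive stub below.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    clf = RandomForestClassifier(random_state=0).fit(X, y)

    def validity(counterfactuals: np.ndarray, desired_class: int) -> float:
        """Fraction of counterfactual instances classified as the desired class."""
        return float(np.mean(clf.predict(counterfactuals) == desired_class))

    def naive_counterfactuals(x: np.ndarray, desired_class: int, n: int = 5) -> np.ndarray:
        """Hypothetical stub: interpolate a query instance toward the target-class mean."""
        target_mean = X[y == desired_class].mean(axis=0)
        alphas = np.linspace(0.2, 1.0, n)[:, None]
        return x + alphas * (target_mean - x)

    x_query = X[0]  # an instance of class 0 (setosa)
    cfs = naive_counterfactuals(x_query, desired_class=1)
    print(f"validity: {validity(cfs, desired_class=1):.2f}")

A validity below 1.0 would indicate exactly the inconsistency reported in the abstract: candidate counterfactuals that fail to cross the decision boundary into the desired class.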



Author information

Correspondence to Vandita Singh.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Singh, V., Cyras, K., Inam, R. (2022). Explainability Metrics and Properties for Counterfactual Explanation Methods. In: Calvaresi, D., Najjar, A., Winikoff, M., Främling, K. (eds) Explainable and Transparent AI and Multi-Agent Systems. EXTRAAMAS 2022. Lecture Notes in Computer Science, vol 13283. Springer, Cham. https://doi.org/10.1007/978-3-031-15565-9_10


  • DOI: https://doi.org/10.1007/978-3-031-15565-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15564-2

  • Online ISBN: 978-3-031-15565-9

  • eBook Packages: Computer Science, Computer Science (R0)
