Robust Explanations for Human-Neural Multi-agent Systems with Formal Verification

  • Conference paper
  • In: Multi-Agent Systems (EUMAS 2023)

Abstract

The quality of explanations in human-agent interactions is fundamental to the development of trustworthy AI systems. In this paper, we study the problem of generating robust contrastive explanations for human-neural multi-agent systems and introduce two novel verification-based algorithms to (i) identify non-robust explanations generated by other methods and (ii) generate contrastive explanations equipped with formal robustness certificates. We present an implementation and evaluate the effectiveness of the approach on two case studies involving neural agents trained on credit scoring and traffic sign recognition tasks.

Work partially supported by the DARPA Assured Autonomy programme (FA8750-18-C-0095), the UK Royal Academy of Engineering (CiET17/18-26) and an Imperial College Research Fellowship awarded to Leofante. This paper is an extended version of [26] presented at AAMAS 2023.
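
To make the abstract's central notion concrete: a contrastive explanation x' for a decision is robust if it keeps its target class even when the network that produced it is perturbed, for instance by retraining. The sketch below is not the paper's algorithm; it is a minimal illustration, in Python with NumPy, of how a verification-style robustness certificate of this kind can be computed for a small fully-connected ReLU network via interval bound propagation over the parameters. All names (interval_layer, certify_explanation), the network shape, and the perturbation radius delta are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's algorithm: certify that a contrastive
# explanation x_cf keeps its target class for every network whose weights
# and biases lie within +/- delta of the given ones, using interval bound
# propagation. Network shape, names, and delta are illustrative assumptions.
import numpy as np

def interval_layer(W, b, lo, hi, delta):
    """Sound output bounds for W'x + b' with W' in [W-delta, W+delta],
    b' in [b-delta, b+delta], and x in [lo, hi] (all elementwise)."""
    Wl, Wh = W - delta, W + delta
    # Worst-case endpoints of each weight-interval * input-interval product.
    cands = np.stack([Wl * lo, Wl * hi, Wh * lo, Wh * hi])
    out_lo = cands.min(axis=0).sum(axis=1) + (b - delta)
    out_hi = cands.max(axis=0).sum(axis=1) + (b + delta)
    return out_lo, out_hi

def certify_explanation(layers, x_cf, target, delta):
    """True iff x_cf provably remains in class `target` under all
    parameter perturbations bounded by delta (a robustness certificate)."""
    lo = hi = np.asarray(x_cf, dtype=float)
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_layer(W, b, lo, hi, delta)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    # Certified iff the target logit's lower bound dominates every other
    # logit's upper bound under every admissible perturbation.
    return all(lo[target] > hi[j] for j in range(len(hi)) if j != target)

# Toy usage on a random 4-8-2 ReLU network (illustrative only).
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((8, 4)), rng.standard_normal(8)),
          (rng.standard_normal((2, 8)), rng.standard_normal(2))]
x_cf = np.array([0.2, -0.1, 0.5, 0.3])
print(certify_explanation(layers, x_cf, target=1, delta=0.01))
```

Because interval propagation over-approximates, a False answer only means no certificate was found; separating genuinely non-robust explanations from ones the bounds are merely too loose to certify, as algorithm (i) in the abstract does, requires a counterexample search or a complete verifier such as those surveyed in [28].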


References

  1. Akintunde, M., Botoeva, E., Kouvaros, P., Lomuscio, A.: Formal verification of neural agents in non-deterministic environments. J. Auton. Agents Multi-Agent Syst. 36(1) (2022)

  2. Barrett, S., Rosenfeld, A., Kraus, S., Stone, P.: Making friends on the fly: cooperating with new teammates. Artif. Intell. 242, 132–171 (2017)

  3. Björkegren, D., Blumenstock, J., Knight, S.: Manipulation-proof machine learning. arXiv preprint arXiv:2004.03865 (2020)

  4. Black, E., Wang, Z., Fredrikson, M.: Consistent counterfactuals for deep models. In: Proceedings of the International Conference on Learning Representations (ICLR22). OpenReview.net (2022)

  5. Botoeva, E., Kouvaros, P., Kronqvist, J., Lomuscio, A., Misener, R.: Efficient verification of neural networks via dependency analysis. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI20), pp. 3291–3299. AAAI Press (2020)

  6. Brix, C., Müller, M.N., Bak, S., Johnson, T.T., Liu, C.: First three years of the international verification of neural networks competition (VNN-COMP). arXiv preprint arXiv:2301.05815 (2023)

  7. Byrne, R.: Counterfactuals in explainable artificial intelligence (XAI): evidence from human reasoning. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI19), pp. 6276–6282 (2019)

  8. Dhurandhar, A., et al.: Explanations based on the missing: towards contrastive explanations with pertinent negatives. In: Advances in Neural Information Processing Systems (NeurIPS18), pp. 590–601 (2018)

  9. Dutta, S., Long, J., Mishra, S., Tilli, C., Magazzeni, D.: Robust counterfactual explanations for tree-based ensembles. In: Proceedings of the International Conference on Machine Learning (ICML22). Proceedings of Machine Learning Research, vol. 162, pp. 5742–5756. PMLR (2022)

  10. FICO Community: Explainable Machine Learning Challenge (2019). https://community.fico.com/s/explainable-machine-learning-challenge

  11. Guidotti, D., Leofante, F., Pulina, L., Tacchella, A.: Verification of neural networks: enhancing scalability through pruning. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI20), pp. 2505–2512. IOS Press (2020)

  12. Guidotti, D., Pulina, L., Tacchella, A.: pyNeVer: a framework for learning and verification of neural networks. In: Hou, Z., Ganesh, V. (eds.) ATVA 2021. LNCS, vol. 12971, pp. 357–363. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88885-5_23

  13. Hancox-Li, L.: Robustness in machine learning explanations: does it matter? In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT*20), pp. 640–647. ACM (2020)

  14. Henriksen, P., Hammernik, K., Rueckert, D., Lomuscio, A.: Bias field robustness verification of large neural image classifiers. In: Proceedings of the 32nd British Machine Vision Conference (BMVC21). BMVA Press (2021)

  15. Henriksen, P., Lomuscio, A.: Efficient neural network verification via adaptive refinement and adversarial search. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI20), pp. 2513–2520. IOS Press (2020)

  16. Henriksen, P., Lomuscio, A.: DEEPSPLIT: an efficient splitting method for neural network verification via indirect effect analysis. In: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI21), pp. 2549–2555. ijcai.org (2021)

  17. Jennings, N.R., et al.: Human-agent collectives. Commun. ACM 57(12), 80–88 (2014)

  18. Jiang, J., Leofante, F., Rago, A., Toni, F.: Formalising the robustness of counterfactual explanations for neural networks. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI23), pp. 14901–14909. AAAI Press (2023)

  19. Johnson, T., et al.: ARCH-COMP20 category report: artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants. In: Proceedings of the 7th International Workshop on Applied Verification of Continuous and Hybrid Systems (ARCH20), pp. 107–139. EasyChair (2020)

  20. Karimi, A., Barthe, G., Balle, B., Valera, I.: Model-agnostic counterfactual explanations for consequential decisions. In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS20), pp. 895–905. PMLR (2020)

  21. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_5

  22. Kenny, E., Keane, M.: On generating plausible counterfactual and semi-factual explanations for deep learning. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI21), pp. 11575–11585. AAAI Press (2021)

  23. Kouvaros, P., et al.: Formal analysis of neural network-based systems in the aircraft domain. In: Huisman, M., Păsăreanu, C., Zhan, N. (eds.) FM 2021. LNCS, vol. 13047, pp. 730–740. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-90870-6_41

  24. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  25. Leofante, F., Botoeva, E., Rajani, V.: Counterfactual explanations and model multiplicity: a relational verification view. In: Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning (KR23) (2023, to appear)

  26. Leofante, F., Lomuscio, A.: Towards robust contrastive explanations for human-neural multi-agent systems. In: Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS23), pp. 2343–2345. ACM (2023)

  27. Leofante, F., Narodytska, N., Pulina, L., Tacchella, A.: Automated verification of neural networks: advances, challenges and perspectives. CoRR abs/1805.09938 (2018)

  28. Liu, C., Arnon, T., Lazarus, C., Strong, C.A., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4(3–4), 244–404 (2021)

  29. Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks. arXiv preprint arXiv:1706.07351 (2017)

  30. Van Looveren, A., Klaise, J.: Interpretable counterfactual explanations guided by prototypes. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds.) ECML PKDD 2021. LNCS (LNAI), vol. 12976, pp. 650–665. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86520-7_40

  31. McCloy, R., Byrne, R.: Semifactual “even if” thinking. Thinking Reason. 8(1), 41–67 (2002)

  32. Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2019)

  33. Mohammadi, K., Karimi, A., Barthe, G., Valera, I.: Scaling guarantees for nearest counterfactual explanations. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES21), pp. 177–187. ACM (2021)

  34. Mohapatra, J., Weng, T., Chen, P., Liu, S., Daniel, L.: Towards verifying robustness of neural networks against a family of semantic perturbations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR20), pp. 241–249. IEEE (2020)

  35. Mothilal, R.K., Sharma, A., Tan, C.: Explaining machine learning classifiers through diverse counterfactual explanations. In: Proceedings of the International Conference on Fairness, Accountability, and Transparency (FAT*20), pp. 607–617. ACM (2020)

  36. Pawelczyk, M., Agarwal, C., Joshi, S., Upadhyay, S., Lakkaraju, H.: Exploring counterfactual explanations through the lens of adversarial examples: a theoretical and empirical analysis. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS22). Proceedings of Machine Learning Research, vol. 151, pp. 4574–4594. PMLR (2022)

  37. Pawelczyk, M., Broelemann, K., Kasneci, G.: On counterfactual explanations under predictive multiplicity. In: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI20). Proceedings of Machine Learning Research, vol. 124, pp. 809–818. AUAI Press (2020)

  38. Pawelczyk, M., Datta, T., van den Heuvel, J., Kasneci, G., Lakkaraju, H.: Probabilistically robust recourse: navigating the trade-offs between costs and robustness in algorithmic recourse. In: Proceedings of the 11th International Conference on Learning Representations (ICLR23). OpenReview.net (2023)

  39. ProPublica: How We Analyzed the COMPAS Recidivism Algorithm (2016). https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  40. Pulina, L., Tacchella, A.: An abstraction-refinement approach to verification of artificial neural networks. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 243–257. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14295-6_24

  41. Rosenfeld, A., Richardson, A.: Explainability in human-agent systems. Auton. Agents Multi Agent Syst. 33(6), 673–705 (2019)

  42. Russell, C.: Efficient search for diverse coherent explanations. In: Proceedings of the International Conference on Fairness, Accountability, and Transparency (FAT*19), pp. 20–28. ACM (2019)

  43. Sharma, S., Henderson, J., Ghosh, J.: CERTIFAI: a common framework to provide explanations and analyse the fairness and robustness of black-box models. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES20), pp. 166–172. ACM (2020)

  44. Slack, D., Hilgard, A., Lakkaraju, H., Singh, S.: Counterfactual explanations can be manipulated. In: Advances in Neural Information Processing Systems 34 (NeurIPS21), pp. 62–75 (2021)

  45. Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The German traffic sign recognition benchmark: a multi-class classification competition. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN11), pp. 1453–1460. IEEE (2011)

  46. Upadhyay, S., Joshi, S., Lakkaraju, H.: Towards robust and reliable algorithmic recourse. In: Advances in Neural Information Processing Systems 34 (NeurIPS21), pp. 16926–16937 (2021)

  47. Ustun, B., Spangher, A., Liu, Y.: Actionable recourse in linear classification. In: Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*19), pp. 10–19. ACM (2019)

  48. Wachter, S., Mittelstadt, B., Russell, C.: Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv. JL Tech. 31, 841 (2017)

  49. Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Efficient formal safety analysis of neural networks. In: Advances in Neural Information Processing Systems (NeurIPS18), pp. 6367–6377. Curran Associates, Inc. (2018)

Author information

Corresponding author

Correspondence to Francesco Leofante.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leofante, F., Lomuscio, A. (2023). Robust Explanations for Human-Neural Multi-agent Systems with Formal Verification. In: Malvone, V., Murano, A. (eds.) Multi-Agent Systems. EUMAS 2023. Lecture Notes in Computer Science, vol. 14282. Springer, Cham. https://doi.org/10.1007/978-3-031-43264-4_16

  • DOI: https://doi.org/10.1007/978-3-031-43264-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43263-7

  • Online ISBN: 978-3-031-43264-4

  • eBook Packages: Computer Science, Computer Science (R0)
