Robust Explanations for Human-Neural Multi-agent Systems with Formal Verification

  • Conference paper
  • In: Multi-Agent Systems (EUMAS 2023)

Abstract

The quality of explanations in human-agent interactions is fundamental to the development of trustworthy AI systems. In this paper, we study the problem of generating robust contrastive explanations for human-neural multi-agent systems and introduce two novel verification-based algorithms to (i) identify non-robust explanations generated by other methods and (ii) generate contrastive explanations equipped with formal robustness certificates. We present an implementation and evaluate the effectiveness of the approach on two case studies involving neural agents trained on credit scoring and traffic sign recognition tasks.

Work partially supported by the DARPA Assured Autonomy programme (FA8750-18-C-0095), the UK Royal Academy of Engineering (CiET17/18-26) and an Imperial College Research Fellowship awarded to Leofante. This paper is an extended version of [26] presented at AAMAS 2023.
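
To make the abstract's central notion concrete: a contrastive explanation x' for a decision is robust if it keeps its target class even when the network that produced it is perturbed, for instance by retraining. The sketch below is not the paper's algorithm; it is a minimal illustration, in Python with NumPy, of how a verification-style robustness certificate of this kind can be computed for a small fully-connected ReLU network via interval bound propagation over the parameters. All names (interval_layer, certify_explanation), the network shape, and the perturbation radius delta are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's algorithm: certify that a contrastive
# explanation x_cf keeps its target class for every network whose weights
# and biases lie within +/- delta of the given ones, using interval bound
# propagation. Network shape, names, and delta are illustrative assumptions.
import numpy as np

def interval_layer(W, b, lo, hi, delta):
    """Sound output bounds for W'x + b' with W' in [W-delta, W+delta],
    b' in [b-delta, b+delta], and x in [lo, hi] (all elementwise)."""
    Wl, Wh = W - delta, W + delta
    # Worst-case endpoints of each weight-interval * input-interval product.
    cands = np.stack([Wl * lo, Wl * hi, Wh * lo, Wh * hi])
    out_lo = cands.min(axis=0).sum(axis=1) + (b - delta)
    out_hi = cands.max(axis=0).sum(axis=1) + (b + delta)
    return out_lo, out_hi

def certify_explanation(layers, x_cf, target, delta):
    """True iff x_cf provably remains in class `target` under all
    parameter perturbations bounded by delta (a robustness certificate)."""
    lo = hi = np.asarray(x_cf, dtype=float)
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_layer(W, b, lo, hi, delta)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    # Certified iff the target logit's lower bound dominates every other
    # logit's upper bound under every admissible perturbation.
    return all(lo[target] > hi[j] for j in range(len(hi)) if j != target)

# Toy usage on a random 4-8-2 ReLU network (illustrative only).
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((8, 4)), rng.standard_normal(8)),
          (rng.standard_normal((2, 8)), rng.standard_normal(2))]
x_cf = np.array([0.2, -0.1, 0.5, 0.3])
print(certify_explanation(layers, x_cf, target=1, delta=0.01))
```

Because interval propagation over-approximates, a False answer only means no certificate was found; separating genuinely non-robust explanations from ones the bounds are merely too loose to certify, as algorithm (i) in the abstract does, requires a counterexample search or a complete verifier such as those surveyed in [28].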


References

  1. Akintunde, M., Botoeva, E., Kouvaros, P., Lomuscio, A.: Formal verification of neural agents in non-deterministic environments. J. Auton. Agents Multi-Agent Syst. 36(1) (2022)

  2. Barrett, S., Rosenfeld, A., Kraus, S., Stone, P.: Making friends on the fly: cooperating with new teammates. Artif. Intell. 242, 132–171 (2017)

  3. Björkegren, D., Blumenstock, J., Knight, S.: Manipulation-proof machine learning. arXiv preprint arXiv:2004.03865 (2020)

  4. Black, E., Wang, Z., Fredrikson, M.: Consistent counterfactuals for deep models. In: Proceedings of the International Conference on Learning Representations (ICLR22). OpenReview.net (2022)

  5. Botoeva, E., Kouvaros, P., Kronqvist, J., Lomuscio, A., Misener, R.: Efficient verification of neural networks via dependency analysis. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI20), pp. 3291–3299. AAAI Press (2020)

  6. Brix, C., Müller, M.N., Bak, S., Johnson, T.T., Liu, C.: First three years of the international verification of neural networks competition (VNN-COMP). arXiv preprint arXiv:2301.05815 (2023)

  7. Byrne, R.: Counterfactuals in explainable artificial intelligence (XAI): evidence from human reasoning. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI19), pp. 6276–6282 (2019)

  8. Dhurandhar, A., et al.: Explanations based on the missing: towards contrastive explanations with pertinent negatives. In: Advances in Neural Information Processing Systems (NeurIPS18), pp. 590–601 (2018)

  9. Dutta, S., Long, J., Mishra, S., Tilli, C., Magazzeni, D.: Robust counterfactual explanations for tree-based ensembles. In: Proceedings of the International Conference on Machine Learning (ICML22). Proceedings of Machine Learning Research, vol. 162, pp. 5742–5756. PMLR (2022)

  10. FICO Community: Explainable Machine Learning Challenge (2019). https://community.fico.com/s/explainable-machine-learning-challenge

  11. Guidotti, D., Leofante, F., Pulina, L., Tacchella, A.: Verification of neural networks: enhancing scalability through pruning. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI20), pp. 2505–2512. IOS Press (2020)

  12. Guidotti, D., Pulina, L., Tacchella, A.: pyNeVer: a framework for learning and verification of neural networks. In: Hou, Z., Ganesh, V. (eds.) ATVA 2021. LNCS, vol. 12971, pp. 357–363. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88885-5_23

  13. Hancox-Li, L.: Robustness in machine learning explanations: does it matter? In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT*20), pp. 640–647. ACM (2020)

  14. Henriksen, P., Hammernik, K., Rueckert, D., Lomuscio, A.: Bias field robustness verification of large neural image classifiers. In: Proceedings of the 32nd British Machine Vision Conference (BMVC21). BMVA Press (2021)

  15. Henriksen, P., Lomuscio, A.: Efficient neural network verification via adaptive refinement and adversarial search. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI20), pp. 2513–2520. IOS Press (2020)

  16. Henriksen, P., Lomuscio, A.: DEEPSPLIT: an efficient splitting method for neural network verification via indirect effect analysis. In: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI21), pp. 2549–2555. ijcai.org (2021)

  17. Jennings, N.R., et al.: Human-agent collectives. Commun. ACM 57(12), 80–88 (2014)

  18. Jiang, J., Leofante, F., Rago, A., Toni, F.: Formalising the robustness of counterfactual explanations for neural networks. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI23), pp. 14901–14909. AAAI Press (2023)

  19. Johnson, T., et al.: ARCH-COMP20 category report: artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants. In: Proceedings of the 7th International Workshop on Applied Verification of Continuous and Hybrid Systems (ARCH20), pp. 107–139. EasyChair (2020)

  20. Karimi, A., Barthe, G., Balle, B., Valera, I.: Model-agnostic counterfactual explanations for consequential decisions. In: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS20), pp. 895–905. PMLR (2020)

  21. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_5

  22. Kenny, E., Keane, M.: On generating plausible counterfactual and semi-factual explanations for deep learning. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI21), pp. 11575–11585. AAAI Press (2021)

  23. Kouvaros, P., et al.: Formal analysis of neural network-based systems in the aircraft domain. In: Huisman, M., Păsăreanu, C., Zhan, N. (eds.) FM 2021. LNCS, vol. 13047, pp. 730–740. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-90870-6_41

  24. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  25. Leofante, F., Botoeva, E., Rajani, V.: Counterfactual explanations and model multiplicity: a relational verification view. In: Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning (KR23) (2023, to appear)

  26. Leofante, F., Lomuscio, A.: Towards robust contrastive explanations for human-neural multi-agent systems. In: Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS23), pp. 2343–2345. ACM (2023)

  27. Leofante, F., Narodytska, N., Pulina, L., Tacchella, A.: Automated verification of neural networks: advances, challenges and perspectives. CoRR abs/1805.09938 (2018)

  28. Liu, C., Arnon, T., Lazarus, C., Strong, C.A., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4(3–4), 244–404 (2021)

  29. Lomuscio, A., Maganti, L.: An approach to reachability analysis for feed-forward ReLU neural networks. arXiv preprint arXiv:1706.07351 (2017)

  30. Van Looveren, A., Klaise, J.: Interpretable counterfactual explanations guided by prototypes. In: Oliver, N., Pérez-Cruz, F., Kramer, S., Read, J., Lozano, J.A. (eds.) ECML PKDD 2021. LNCS (LNAI), vol. 12976, pp. 650–665. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86520-7_40

  31. McCloy, R., Byrne, R.: Semifactual “even if” thinking. Thinking Reason. 8(1), 41–67 (2002)

  32. Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2019)

  33. Mohammadi, K., Karimi, A., Barthe, G., Valera, I.: Scaling guarantees for nearest counterfactual explanations. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES21), pp. 177–187. ACM (2021)

  34. Mohapatra, J., Weng, T., Chen, P., Liu, S., Daniel, L.: Towards verifying robustness of neural networks against a family of semantic perturbations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR20), pp. 241–249. IEEE (2020)

  35. Mothilal, R.K., Sharma, A., Tan, C.: Explaining machine learning classifiers through diverse counterfactual explanations. In: Proceedings of the International Conference on Fairness, Accountability, and Transparency (FAT*20), pp. 607–617. ACM (2020)

  36. Pawelczyk, M., Agarwal, C., Joshi, S., Upadhyay, S., Lakkaraju, H.: Exploring counterfactual explanations through the lens of adversarial examples: a theoretical and empirical analysis. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS22). Proceedings of Machine Learning Research, vol. 151, pp. 4574–4594. PMLR (2022)

  37. Pawelczyk, M., Broelemann, K., Kasneci, G.: On counterfactual explanations under predictive multiplicity. In: Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI20). Proceedings of Machine Learning Research, vol. 124, pp. 809–818. AUAI Press (2020)

  38. Pawelczyk, M., Datta, T., van den Heuvel, J., Kasneci, G., Lakkaraju, H.: Probabilistically robust recourse: navigating the trade-offs between costs and robustness in algorithmic recourse. In: Proceedings of the 11th International Conference on Learning Representations (ICLR23). OpenReview.net (2023)

  39. ProPublica: How We Analyzed the COMPAS Recidivism Algorithm (2016). https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

  40. Pulina, L., Tacchella, A.: An abstraction-refinement approach to verification of artificial neural networks. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 243–257. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14295-6_24

  41. Rosenfeld, A., Richardson, A.: Explainability in human-agent systems. Auton. Agents Multi Agent Syst. 33(6), 673–705 (2019)

  42. Russell, C.: Efficient search for diverse coherent explanations. In: Proceedings of the International Conference on Fairness, Accountability, and Transparency (FAT*19), pp. 20–28. ACM (2019)

  43. Sharma, S., Henderson, J., Ghosh, J.: CERTIFAI: a common framework to provide explanations and analyse the fairness and robustness of black-box models. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES20), pp. 166–172. ACM (2020)

  44. Slack, D., Hilgard, A., Lakkaraju, H., Singh, S.: Counterfactual explanations can be manipulated. In: Advances in Neural Information Processing Systems 34 (NeurIPS21), pp. 62–75 (2021)

  45. Stallkamp, J., Schlipsing, M., Salmen, J., Igel, C.: The German traffic sign recognition benchmark: a multi-class classification competition. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN11), pp. 1453–1460. IEEE (2011)

  46. Upadhyay, S., Joshi, S., Lakkaraju, H.: Towards robust and reliable algorithmic recourse. In: Advances in Neural Information Processing Systems 34 (NeurIPS21), pp. 16926–16937 (2021)

  47. Ustun, B., Spangher, A., Liu, Y.: Actionable recourse in linear classification. In: Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*19), pp. 10–19. ACM (2019)

  48. Wachter, S., Mittelstadt, B., Russell, C.: Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv. JL Tech. 31, 841 (2017)

  49. Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Efficient formal safety analysis of neural networks. In: Advances in Neural Information Processing Systems (NeurIPS18), pp. 6367–6377. Curran Associates, Inc. (2018)

Author information

Corresponding author

Correspondence to Francesco Leofante.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leofante, F., Lomuscio, A. (2023). Robust Explanations for Human-Neural Multi-agent Systems with Formal Verification. In: Malvone, V., Murano, A. (eds.) Multi-Agent Systems. EUMAS 2023. Lecture Notes in Computer Science, vol. 14282. Springer, Cham. https://doi.org/10.1007/978-3-031-43264-4_16

  • DOI: https://doi.org/10.1007/978-3-031-43264-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43263-7

  • Online ISBN: 978-3-031-43264-4

  • eBook Packages: Computer Science, Computer Science (R0)
