Abstract
For humans to trust artificial intelligence (AI) systems, machine learning (ML) models must be interpretable to users. The judicial process, for example, requires that AI conclusions be rigorous and fully interpretable. In this paper, we propose a novel approach, VAE-SLIME, which provides stable local interpretable model-agnostic explanations (SLIME) based on a variational autoencoder (VAE). LIME is a technique that explains the predictions of any classifier in an interpretable and faithful manner. Despite its great success as the most popular method in this category, LIME has several disadvantages stemming from its random perturbation-based sampling. VAE-SLIME is specifically designed to address the lack of stability and local fidelity that LIME exhibits on tabular data. It first replaces the random Gaussian noise used in the VAE's reparameterization trick with fixed noise, and then uses this modified VAE, rather than random perturbation, to generate stable samples. By considering the sequential relationship and flipping of features, we introduce a novel explanation stability metric, the feature sequence stability index (FSSI), to accurately evaluate the stability of explanations. In a comparison with 6 state-of-the-art approaches on 7 commonly used tabular datasets, the experimental results show that the explanations produced by our approach are the most stable, and that its local fidelity is on average 65.17% higher than that of the other approaches.
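To illustrate the core idea from the abstract, the sketch below contrasts the standard VAE reparameterization trick, which draws fresh Gaussian noise on every call, with a fixed-noise variant in the spirit of VAE-SLIME. This is a minimal NumPy illustration of the sampling step only, not the authors' implementation; the function names and shapes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng()

def reparameterize_random(mu, log_var):
    # Standard VAE sampling: z = mu + sigma * eps, with eps drawn fresh
    # each call, so repeated sampling yields different neighborhoods.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Fixed-noise variant (hypothetical sketch of the idea in the abstract):
# eps is generated once and reused, so the same latent means and variances
# always map to the same perturbed samples, giving stable explanations.
FIXED_EPS = np.random.default_rng(42).standard_normal((5, 3))

def reparameterize_fixed(mu, log_var):
    return mu + np.exp(0.5 * log_var) * FIXED_EPS

mu = np.zeros((5, 3))
log_var = np.zeros((5, 3))

a = reparameterize_fixed(mu, log_var)
b = reparameterize_fixed(mu, log_var)
print(np.array_equal(a, b))  # fixed noise: identical samples across calls
```

With fixed noise, the perturbed neighborhood around an instance is deterministic, so the local surrogate model fitted on it (and hence the explanation) does not vary between runs.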
Code Availability
The code will be available at https://github.com/yuhongcqupt/VAE-SLIME.
Funding
This work was jointly supported by the National Natural Science Foundation of China (62136002, 62233018, 62221005), the Natural Science Foundation of Chongqing (cstc2022ycjhbgzxm0004) and the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Xu Xiang and Hong Yu contributed equally to this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xiang, X., Yu, H., Wang, Y. et al. Stable local interpretable model-agnostic explanations based on a variational autoencoder. Appl Intell 53, 28226–28240 (2023). https://doi.org/10.1007/s10489-023-04942-5