Abstract
We propose a permutation-based explanation method for image classifiers. Current explanations for image models, such as activation maps, are limited to instance-based explanations in pixel space, making it difficult to understand global model behavior. In contrast, permutation-based explanations for tabular data classifiers measure feature importance by comparing model performance on data before and after permuting a feature. We propose an explanation method for image-based models that permutes interpretable concepts across dataset images. Given a dataset of images labeled with specific concepts, such as captions, we permute a concept across examples in the text space and then generate images via a text-conditioned diffusion model. Feature importance is then reflected by the change in model performance relative to unpermuted data. When applied to a set of concepts, the method produces a ranking of feature importance. We show this approach recovers underlying model feature importance on synthetic and real-world image classification tasks.
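To make the procedure concrete, the sketch below illustrates concept-level permutation importance under the assumptions stated in the abstract. It is a minimal sketch, not the authors' implementation: `generator` (a text-conditioned diffusion model mapping a caption to an image), `classifier`, `metric`, and the caption/concept inputs are all hypothetical placeholders.

```python
# Minimal sketch of concept-level permutation importance, assuming hypothetical
# callables: generator(caption) -> image, classifier(image) -> prediction,
# and metric(labels, predictions) -> score (e.g., accuracy or AUROC).
import numpy as np


def concept_permutation_importance(classifier, generator, captions, labels,
                                   concept_values, metric, n_repeats=5):
    """Estimate the importance of one concept by permuting it in text space,
    regenerating images with a text-conditioned diffusion model, and measuring
    the change in classifier performance relative to unpermuted data."""
    # Baseline: images generated from the original (unpermuted) captions.
    baseline_preds = [classifier(generator(c)) for c in captions]
    baseline_score = metric(labels, baseline_preds)

    drops = []
    for _ in range(n_repeats):
        # Permute the concept's values across examples, keeping the rest of
        # each caption fixed.
        permuted = np.random.permutation(concept_values)
        permuted_captions = [
            cap.replace(orig, new)  # swap the concept phrase in the caption
            for cap, orig, new in zip(captions, concept_values, permuted)
        ]
        preds = [classifier(generator(c)) for c in permuted_captions]
        drops.append(baseline_score - metric(labels, preds))

    # A larger average drop indicates the model relies more on this concept.
    return float(np.mean(drops))
```

Running this for each concept in a candidate set and sorting by the returned score yields the feature-importance ranking described above.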
D. Fouhey and J. Wiens—co-senior authors.
Acknowledgements
We thank Donna Tjandra, Fahad Kamran, Jung Min Lee, Meera Krishnamoorthy, Michael Ito, Mohamed El Banani, Shengpu Tang, Stephanie Shepard, Trenton Chang and Winston Chen for their helpful conversations and feedback. This work was supported by grant R01 HL158626 from the National Heart, Lung, and Blood Institute (NHLBI).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jabbour, S., Kondas, G., Kazerooni, E., Sjoding, M., Fouhey, D., Wiens, J. (2025). DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15122. Springer, Cham. https://doi.org/10.1007/978-3-031-73039-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73038-2
Online ISBN: 978-3-031-73039-9