Abstract
We propose a permutation-based explanation method for image classifiers. Current explanations for image models, such as activation maps, are limited to instance-based explanations in pixel space, making it difficult to understand global model behavior. In contrast, permutation-based explanations for tabular data classifiers measure feature importance by comparing model performance on data before and after permuting a feature. We propose an explanation method for image-based models that permutes interpretable concepts across dataset images. Given a dataset of images labeled with specific concepts, such as captions, we permute a concept across examples in the text space and then generate images via a text-conditioned diffusion model. Feature importance is then reflected by the change in model performance relative to unpermuted data. When applied to a set of concepts, the method produces a ranking of feature importance. We show this approach recovers underlying model feature importance on synthetic and real-world image classification tasks.
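To make the procedure concrete, the sketch below illustrates concept-level permutation importance under the assumptions stated in the abstract. It is a minimal sketch, not the authors' implementation: `generator` (a text-conditioned diffusion model mapping a caption to an image), `classifier`, `metric`, and the caption/concept inputs are all hypothetical placeholders.

```python
# Minimal sketch of concept-level permutation importance, assuming hypothetical
# callables: generator(caption) -> image, classifier(image) -> prediction,
# and metric(labels, predictions) -> score (e.g., accuracy or AUROC).
import numpy as np


def concept_permutation_importance(classifier, generator, captions, labels,
                                   concept_values, metric, n_repeats=5):
    """Estimate the importance of one concept by permuting it in text space,
    regenerating images with a text-conditioned diffusion model, and measuring
    the change in classifier performance relative to unpermuted data."""
    # Baseline: images generated from the original (unpermuted) captions.
    baseline_preds = [classifier(generator(c)) for c in captions]
    baseline_score = metric(labels, baseline_preds)

    drops = []
    for _ in range(n_repeats):
        # Permute the concept's values across examples, keeping the rest of
        # each caption fixed.
        permuted = np.random.permutation(concept_values)
        permuted_captions = [
            cap.replace(orig, new)  # swap the concept phrase in the caption
            for cap, orig, new in zip(captions, concept_values, permuted)
        ]
        preds = [classifier(generator(c)) for c in permuted_captions]
        drops.append(baseline_score - metric(labels, preds))

    # A larger average drop indicates the model relies more on this concept.
    return float(np.mean(drops))
```

Running this for each concept in a candidate set and sorting by the returned score yields the feature-importance ranking described above.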
D. Fouhey and J. Wiens—co-senior authors.
Acknowledgements
We thank Donna Tjandra, Fahad Kamran, Jung Min Lee, Meera Krishnamoorthy, Michael Ito, Mohamed El Banani, Shengpu Tang, Stephanie Shepard, Trenton Chang and Winston Chen for their helpful conversations and feedback. This work was supported by grant R01 HL158626 from the National Heart, Lung, and Blood Institute (NHLBI).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jabbour, S., Kondas, G., Kazerooni, E., Sjoding, M., Fouhey, D., Wiens, J. (2025). DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15122. Springer, Cham. https://doi.org/10.1007/978-3-031-73039-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73038-2
Online ISBN: 978-3-031-73039-9