Abstract
Explaining the decisions of deep learning models is critical for their adoption in medical practice. In this work, we propose to unify existing adversarial explanation methods and path-based feature importance attribution approaches. We consider a path between the input image and a generated adversary and associate a weight depending on the model output variations along this path. We validate our attribution methods on two medical classification tasks. We demonstrate significant improvement compared to state-of-the-art methods in both feature importance attribution and localization performance.
Notes
1. As in [3, 23], consider an encoder(E)-generator(G) architecture. E (resp. G) maps from (resp. to) the space of real images (\(\subset \mathbb {R}^n\)) to (resp. from) an encoding space (\(\subset \mathbb {R}^k\)). The real-image path \(\gamma \) can for instance be defined as \(\gamma : \lambda \rightarrow G(z_{{\mathbf {x}}}+ \lambda (z_{{\mathbf {x_a}}}- z_{{\mathbf {x}}}))\), where \(z_{{\mathbf {x}}}= E({{\mathbf {x}}})\) and \(z_{{\mathbf {x_a}}}= E({{\mathbf {x_a}}})\). It follows that \(\frac{d {{\boldsymbol{\gamma }}}}{d \lambda } = \frac{\partial G}{\partial z}(z_{{\mathbf {x}}}+ \lambda (z_{{\mathbf {x_a}}}- z_{{\mathbf {x}}}))(z_{{\mathbf {x_a}}}- z_{{\mathbf {x}}})\). However, \(\frac{\partial G}{\partial z}\) is a Jacobian matrix with \(n \times k\) entries, a number that easily reaches a magnitude of \(10^9\), and it would have to be evaluated at several values of \(\lambda \).
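The note above motivates avoiding the explicit Jacobian \(\frac{\partial G}{\partial z}\): since only the image-space displacement along the path matters, \(\frac{d \gamma}{d \lambda}\) can be approximated by finite differences of \(\gamma\) itself. The following is a minimal sketch of attribution along such a generator path, using toy linear maps; the names `E`, `G`, `f` and `path_attribution` are illustrative stand-ins, not the authors' implementation, and the step weighting here is a plain midpoint rule (the paper additionally weights steps by the model output variations).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 16, 4  # toy image and latent dimensions

# Hypothetical linear encoder, generator, and classifier score.
W_e = rng.normal(size=(k, n))
W_g = rng.normal(size=(n, k))
w_f = rng.normal(size=n)

E = lambda x: W_e @ x          # image -> latent code
G = lambda z: W_g @ z          # latent code -> image
f = lambda x: float(w_f @ x)   # classifier score (logit)

def grad_f(x):
    # Gradient of the toy linear classifier; use autograd for a real model.
    return w_f

def path_attribution(x, x_a, steps=64):
    """Attribute f along the generator path gamma(lam) from x to the
    adversary x_a, using finite differences of gamma instead of the
    (large) Jacobian of G."""
    z_x, z_a = E(x), E(x_a)
    lams = np.linspace(0.0, 1.0, steps + 1)
    gammas = np.stack([G(z_x + lam * (z_a - z_x)) for lam in lams])
    attr = np.zeros(n)
    for i in range(steps):
        mid = 0.5 * (gammas[i] + gammas[i + 1])   # midpoint of the segment
        attr += grad_f(mid) * (gammas[i + 1] - gammas[i])
    return attr

x = rng.normal(size=n)
x_a = x + 0.1 * rng.normal(size=n)  # stand-in for a generated adversary
a = path_attribution(x, x_a)
# For this linear setup, attributions sum to f(gamma(1)) - f(gamma(0)).
```

This preserves the completeness property of path-based attributions (the attributions sum to the score difference between the path endpoints) while never materializing the \(n \times k\) Jacobian.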
References
Bien, N., et al.: Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 15(11), e1002699 (2018)
Chang, C.H., Creager, E., Goldenberg, A., Duvenaud, D.K.: Explaining image classifiers by counterfactual generation. In: ICLR (2019)
Charachon, M., Cournède, P., Hudelot, C., Ardon, R.: Leveraging conditional generative models in a general explanation framework of classifier decisions. ArXiv preprint (2021)
Charachon, M., Hudelot, C., Cournède, P.H., Ruppli, C., Ardon, R.: Combining similarity and adversarial learning to generate visual explanation: application to medical image classification. In: ICPR (2020)
Dabkowski, P., Gal, Y.: Real time image saliency for black box classifiers. In: NIPS (2017)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
Elliott, A., Law, S., Russell, C.: Adversarial perturbations on the perceptual ball. ArXiv arXiv:1912.09405 (2019)
Esteva, A., et al.: Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017)
Fong, R.C., Vedaldi, A.: Interpretable explanations of black boxes by meaningful perturbation. In: ICCV (2017)
Goyal, Y., Wu, Z., Ernst, J., Batra, D., Parikh, D., Lee, S.: Counterfactual visual explanations. In: ICML (2019)
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: ECCV (2016)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
Lim, D., Lee, H., Kim, S.: Building reliable explanations of unreliable neural networks: locally smoothing perspective of model interpretation. In: CVPR (2021)
Pratt, H., Coenen, F., Broadbent, D.M., Harding, S.P., Zheng, Y.: Convolutional neural networks for diabetic retinopathy. Procedia Comput. Sci. 90, 200–205 (2016). https://doi.org/10.1016/j.procs.2016.07.014
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: MICCAI (2015)
Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019)
Samek, W., Binder, A., Montavon, G., Lapuschkin, S., Müller, K.: Evaluating the visualization of what a deep neural network has learned. IEEE Trans. Neural Netw. Learn. Syst. (2017)
Seah, J.C.Y., Tang, J.S.N., Kitchen, A., Gaillard, F., Dixon, A.F.: Chest radiographs in congestive heart failure: visualizing neural network learning. Radiology 290, 514–522 (2019)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV (2017)
Siddiquee, M.R., et al.: Learning fixed points in generative adversarial networks: from image-to-image translation to disease detection and localization. In: ICCV, pp. 191–200 (2019)
Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. In: ICLR (2014)
Simpson, A., et al.: A large annotated medical image dataset for the development and evaluation of segmentation algorithms. ArXiv arXiv:1902.09063 (2019)
Singla, S., Pollack, B., Chen, J., Batmanghelich, K.: Explanation by progressive exaggeration. In: ICLR (2020)
Smilkov, D., Thorat, N., Kim, B., Viégas, F.B., Wattenberg, M.: SmoothGrad: removing noise by adding noise. ArXiv arXiv:1706.03825 (2017)
Springenberg, J.T., Dosovitskiy, A., Brox, T., Riedmiller, M.A.: Striving for simplicity: the all convolutional net. In: ICLR (2015). arXiv:1412.6806
Sundararajan, M., Taly, A., Yan, Q.: Axiomatic attribution for deep networks. In: ICML (2017)
Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R.M.: ChestX-ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: CVPR (2017)
Woods, W., Chen, J., Teuscher, C.: Adversarial explanations for understanding image classification decisions and improved neural network robustness. Nat. Mach. Intell. 1 (2019)
Xu, S.Z., Venugopalan, S., Sundararajan, M.: Attribution in scale and space. In: CVPR (2020)
Zhou, B., Khosla, A., Lapedriza, À., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: CVPR (2016)
Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Charachon, M., Cournède, P.H., Hudelot, C., Ardon, R. (2021). Visual Explanation by Unifying Adversarial Generation and Feature Importance Attributions. In: Reyes, M., et al. (eds.) Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and Its Applications for Medical Data. IMIMIC 2021, TDA4MedicalData 2021. Lecture Notes in Computer Science, vol. 12929. Springer, Cham. https://doi.org/10.1007/978-3-030-87444-5_5
Print ISBN: 978-3-030-87443-8
Online ISBN: 978-3-030-87444-5