Abstract
Heatmaps generated on inputs of image classification networks via explainable AI methods like Grad-CAM and LRP have been observed to resemble segmentations of the input images in many cases. Consequently, heatmaps have also been leveraged to achieve weakly supervised segmentation with image-level supervision. On the other hand, losses can be imposed on differentiable heatmaps, which has been shown to serve for (1) making heatmaps more human-interpretable, (2) regularizing networks towards better generalization, (3) training diverse ensembles of networks, and (4) explicitly ignoring confounding input features. Due to the latter use case, the paradigm of imposing losses on heatmaps is often referred to as “Right for the right reasons”. We unify these two lines of research by investigating semi-supervised segmentation as a novel use case for the Right for the Right Reasons paradigm. First, we show formal parallels between differentiable heatmap architectures and standard encoder-decoder architectures for image segmentation. Second, we show that such differentiable heatmap architectures yield competitive results when trained with standard segmentation losses. Third, we show that such architectures allow for training with weak supervision in the form of image-level labels combined with small numbers of pixel-level labels, outperforming comparable encoder-decoder models. Code is available at https://github.com/Kainmueller-Lab/TW-autoencoder.
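The core idea of imposing a segmentation loss on a differentiable heatmap can be illustrated in a few lines of PyTorch. The sketch below is a minimal illustration only, not the authors' TW-autoencoder code: it computes a Grad-CAM heatmap with create_graph=True so the heatmap stays inside the autograd graph, then adds a pixel-level loss on it to the usual classification loss. The ResNet-18 backbone, the choice of layer4 as the target layer, the dummy data, and the use of the un-rectified CAM as per-pixel logits (dropping Grad-CAM's final ReLU) are all illustrative assumptions.

import torch
import torch.nn.functional as F
import torchvision

# Standard classifier; the Grad-CAM heatmap is computed from its last conv block.
model = torchvision.models.resnet18(num_classes=21)   # e.g. 21 Pascal VOC classes
acts = {}
model.layer4.register_forward_hook(lambda mod, inp, out: acts.update(feat=out))

def differentiable_cam(logits, target_class):
    # Grad-CAM computed with create_graph=True, so the heatmap remains in the
    # autograd graph and a loss on it backpropagates into the classifier weights.
    feat = acts["feat"]                                  # (B, C, h, w)
    score = logits.gather(1, target_class[:, None]).sum()
    grads = torch.autograd.grad(score, feat, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)       # spatial average pooling
    return (weights * feat).sum(dim=1, keepdim=True)     # raw CAM, used as logits

x = torch.randn(2, 3, 224, 224)                          # dummy image batch
y = torch.tensor([3, 7])                                 # image-level class labels
mask = torch.randint(0, 2, (2, 1, 224, 224)).float()     # pixel-level labels

logits = model(x)
cam = F.interpolate(differentiable_cam(logits, y), size=mask.shape[-2:],
                    mode="bilinear", align_corners=False)

# Joint objective: image-level classification loss plus a segmentation-style
# pixel-level loss imposed directly on the heatmap.
loss = F.cross_entropy(logits, y) + F.binary_cross_entropy_with_logits(cam, mask)
loss.backward()

In a semi-supervised setting such as the one studied here, the pixel-level term would be evaluated only on the few images that carry masks, while every image contributes to the classification term.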
References
Achtibat, R., et al.: From attribution maps to human-understandable explanations through concept relevance propagation. Nat. Mach. Intell. 5, 1006–1019 (2023)
Alvi, M., Zisserman, A., Nellåker, C.: Turning a blind eye: explicit removal of biases and variation from deep neural network embeddings. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11129, pp. 556–572. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11009-3_34
Ancona, M., Ceolini, E., Öztireli, C., Gross, M.: Towards better understanding of gradient-based attribution methods for deep neural networks. arXiv preprint arXiv:1711.06104 (2017)
Anders, C.J., Weber, L., Neumann, D., Samek, W., Müller, K.R., Lapuschkin, S.: Finding and removing Clever Hans: using explanation methods to debug and improve deep models. Inf. Fusion 77, 261–295 (2022)
Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)
Berthelot, D., Carlini, N., Goodfellow, I., Papernot, N., Oliver, A., Raffel, C.A.: MixMatch: a holistic approach to semi-supervised learning. Adv. Neural Inf. Process. Syst. 32 (2019)
Drucker, H., Le Cun, Y.: Improving generalization performance using double backpropagation. IEEE Trans. Neural Netw. 3(6), 991–997 (1992)
Du, Y., Fu, Z., Liu, Q., Wang, Y.: Weakly supervised semantic segmentation by pixel-to-prototype contrast. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4320–4329 (2022)
Etmann, C.: A closer look at double backpropagation. arXiv preprint arXiv:1906.06637 (2019)
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010)
Gur, S., Ali, A., Wolf, L.: Visualization of supervised and self-supervised neural networks via attribution guided factorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11545–11554 (2021)
Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18(2), 203–211 (2021)
Kim, B., Han, S., Kim, J.: Discriminative region suppression for weakly-supervised semantic segmentation. arXiv preprint arXiv:2103.07246 (2021)
Kim, H.E., Hwang, S.: Deconvolutional feature stacking for weakly-supervised semantic segmentation. arXiv preprint arXiv:1602.04984 (2016)
Kim, S., Nguyen, L.T., Shim, K., Kim, J., Shim, B.: Pseudo-label-free weakly supervised semantic segmentation using image masking. IEEE Access 10, 19401–19411 (2022)
Lai, X., et al.: Semi-supervised semantic segmentation with directional context-aware consistency. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1205–1214 (2021)
Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., Müller, K.R.: Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 10(1), 1096 (2019)
Lee, J., Kim, E., Lee, S., Lee, J., Yoon, S.: FickleNet: weakly and semi-supervised semantic image segmentation using stochastic inference. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019)
Li, K., Wu, Z., Peng, K.C., Ernst, J., Fu, Y.: Tell me where to look: guided attention inference network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9215–9223 (2018)
Liu, F., Avci, B.: Incorporating priors with feature attribution on text classification. arXiv preprint arXiv:1906.08286 (2019)
Liu, S., Zhi, S., Johns, E., Davison, A.: Bootstrapping semantic segmentation with regional contrast. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=6u6N8WWwYSM
Luo, W., Yang, M.: Semi-supervised semantic segmentation via strong-weak dual-branch network. In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part V, pp. 784–800. Springer, Cham (2020)
Montavon, G., Binder, A., Lapuschkin, S., Samek, W., Müller, K.-R.: Layer-wise relevance propagation: an overview. In: Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. LNCS (LNAI), vol. 11700, pp. 193–209. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28954-6_10
Nam, W.J., Gur, S., Choi, J., Wolf, L., Lee, S.W.: Relative attributing propagation: interpreting the comparative contributions of individual units in deep neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 2501–2508 (2020)
Ouali, Y., Hudelot, C., Tami, M.: Semi-supervised semantic segmentation with cross-consistency training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12674–12684 (2020)
Pan, J., et al.: Learning self-supervised low-rank network for single-stage weakly and semi-supervised semantic segmentation. Int. J. Comput. Vis. 130(5), 1181–1195 (2022)
Rao, S., Böhle, M., Parchami-Araghi, A., Schiele, B.: Using explanations to guide models. arXiv preprint arXiv:2303.11932 (2023)
Rieger, L., Singh, C., Murdoch, W., Yu, B.: Interpretations are useful: penalizing explanations to align neural networks with prior knowledge. In: International Conference on Machine Learning, pp. 8116–8126. PMLR (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Ross, A.S., Hughes, M.C., Doshi-Velez, F.: Right for the right reasons: training differentiable models by constraining their explanations. arXiv preprint arXiv:1703.03717 (2017)
Schramowski, P., et al.: Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nat. Mach. Intell. 2(8), 476–486 (2020)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
Shakya, S., Vasquez, M., Wang, Y., Tchoua, R., Furst, J., Raicu, D.: Human-in-the-loop deep learning retinal image classification with customized loss function. In: Medical Imaging 2022: Computer-Aided Diagnosis, vol. 12033, pp. 512–519. SPIE (2022)
Shao, X., Skryagin, A., Stammer, W., Schramowski, P., Kersting, K.: Right for better reasons: training differentiable models by constraining their influence functions. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 9533–9540 (2021)
Sohn, K., et al.: FixMatch: simplifying semi-supervised learning with consistency and confidence. Adv. Neural Inf. Process. Syst. 33, 596–608 (2020)
Teney, D., Abbasnejad, E., Lucey, S., van den Hengel, A.: Evading the simplicity bias: training a diverse set of models discovers solutions with superior OOD generalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16761–16772 (2022)
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A., Bottou, L.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11(12) (2010)
Weber, L., Lapuschkin, S., Binder, A., Samek, W.: Beyond explaining: opportunities and challenges of XAI-based model improvement. Inf. Fusion 92, 154–176 (2023)
Wei, Y., Xiao, H., Shi, H., Jie, Z., Feng, J., Huang, T.S.: Revisiting dilated convolution: a simple approach for weakly- and semi-supervised semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7268–7277 (2018)
Acknowledgments
Funding: X.Y.: Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS); J.F.: German Research Foundation RTG 2424; W.S. and D.K.: German Research Foundation grant DFG KI-FOR 5363, project no. 459422098.
Ethics declarations
Disclosure of Interest
The authors declare that they have no conflicts of interest related to this work.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, X., Franzen, J., Samek, W., Höhne, M.M.-C., Kainmueller, D. (2024). Model Guidance via Explanations Turns Image Classifiers into Segmentation Models. In: Longo, L., Lapuschkin, S., Seifert, C. (eds.) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2154. Springer, Cham. https://doi.org/10.1007/978-3-031-63797-1_7
DOI: https://doi.org/10.1007/978-3-031-63797-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63796-4
Online ISBN: 978-3-031-63797-1