Abstract
Heatmaps generated on inputs of image classification networks via explainable AI methods like Grad-CAM and LRP have been observed to resemble segmentations of the input images in many cases. Consequently, heatmaps have also been leveraged to achieve weakly supervised segmentation with image-level supervision. On the other hand, losses can be imposed on differentiable heatmaps, which has been shown to serve for (1) making heatmaps more human-interpretable, (2) regularizing networks towards better generalization, (3) training diverse ensembles of networks, and (4) explicitly ignoring confounding input features. Due to the latter use case, the paradigm of imposing losses on heatmaps is often referred to as “Right for the right reasons”. We unify these two lines of research by investigating semi-supervised segmentation as a novel use case for the Right for the Right Reasons paradigm. First, we show formal parallels between differentiable heatmap architectures and standard encoder-decoder architectures for image segmentation. Second, we show that such differentiable heatmap architectures yield competitive results when trained with standard segmentation losses. Third, we show that such architectures allow for training with weak supervision in the form of image-level labels combined with small numbers of pixel-level labels, outperforming comparable encoder-decoder models. Code is available at https://github.com/Kainmueller-Lab/TW-autoencoder.
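The core idea of imposing a segmentation loss on a differentiable heatmap can be illustrated in a few lines of PyTorch. The sketch below is a minimal illustration only, not the authors' TW-autoencoder code: it computes a Grad-CAM heatmap with create_graph=True so the heatmap stays inside the autograd graph, then adds a pixel-level loss on it to the usual classification loss. The ResNet-18 backbone, the choice of layer4 as the target layer, the dummy data, and the use of the un-rectified CAM as per-pixel logits (dropping Grad-CAM's final ReLU) are all illustrative assumptions.

import torch
import torch.nn.functional as F
import torchvision

# Standard classifier; the Grad-CAM heatmap is computed from its last conv block.
model = torchvision.models.resnet18(num_classes=21)   # e.g. 21 Pascal VOC classes
acts = {}
model.layer4.register_forward_hook(lambda mod, inp, out: acts.update(feat=out))

def differentiable_cam(logits, target_class):
    # Grad-CAM computed with create_graph=True, so the heatmap remains in the
    # autograd graph and a loss on it backpropagates into the classifier weights.
    feat = acts["feat"]                                  # (B, C, h, w)
    score = logits.gather(1, target_class[:, None]).sum()
    grads = torch.autograd.grad(score, feat, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)       # spatial average pooling
    return (weights * feat).sum(dim=1, keepdim=True)     # raw CAM, used as logits

x = torch.randn(2, 3, 224, 224)                          # dummy image batch
y = torch.tensor([3, 7])                                 # image-level class labels
mask = torch.randint(0, 2, (2, 1, 224, 224)).float()     # pixel-level labels

logits = model(x)
cam = F.interpolate(differentiable_cam(logits, y), size=mask.shape[-2:],
                    mode="bilinear", align_corners=False)

# Joint objective: image-level classification loss plus a segmentation-style
# pixel-level loss imposed directly on the heatmap.
loss = F.cross_entropy(logits, y) + F.binary_cross_entropy_with_logits(cam, mask)
loss.backward()

In a semi-supervised setting such as the one studied here, the pixel-level term would be evaluated only on the few images that carry masks, while every image contributes to the classification term.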
References
Achtibat, R., et al.: From attribution maps to human-understandable explanations through concept relevance propagation. Nat. Mach. Intell. 5, 1006–1019 (2023)
Alvi, M., Zisserman, A., Nellåker, C.: Turning a blind eye: explicit removal of biases and variation from deep neural network embeddings. In: Leal-Taixé, L., Roth, S. (eds.) ECCV 2018. LNCS, vol. 11129, pp. 556–572. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-11009-3_34
Ancona, M., Ceolini, E., Öztireli, C., Gross, M.: Towards better understanding of gradient-based attribution methods for deep neural networks. arXiv preprint arXiv:1711.06104 (2017)
Anders, C.J., Weber, L., Neumann, D., Samek, W., Müller, K.R., Lapuschkin, S.: Finding and removing Clever Hans: using explanation methods to debug and improve deep models. Inf. Fusion 77, 261–295 (2022)
Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)
Berthelot, D., Carlini, N., Goodfellow, I., Papernot, N., Oliver, A., Raffel, C.A.: MixMatch: a holistic approach to semi-supervised learning. Adv. Neural Inf. Process. Syst. 32 (2019)
Drucker, H., Le Cun, Y.: Improving generalization performance using double backpropagation. IEEE Trans. Neural Netw. 3(6), 991–997 (1992)
Du, Y., Fu, Z., Liu, Q., Wang, Y.: Weakly supervised semantic segmentation by pixel-to-prototype contrast. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4320–4329 (2022)
Etmann, C.: A closer look at double backpropagation. arXiv preprint arXiv:1906.06637 (2019)
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010)
Gur, S., Ali, A., Wolf, L.: Visualization of supervised and self-supervised neural networks via attribution guided factorization. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11545–11554 (2021)
Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18(2), 203–211 (2021)
Kim, B., Han, S., Kim, J.: Discriminative region suppression for weakly-supervised semantic segmentation. arXiv preprint arXiv:2103.07246 (2021)
Kim, H.E., Hwang, S.: Deconvolutional feature stacking for weakly-supervised semantic segmentation. arXiv preprint arXiv:1602.04984 (2016)
Kim, S., Nguyen, L.T., Shim, K., Kim, J., Shim, B.: Pseudo-label-free weakly supervised semantic segmentation using image masking. IEEE Access 10, 19401–19411 (2022)
Lai, X., et al.: Semi-supervised semantic segmentation with directional context-aware consistency. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1205–1214 (2021)
Lapuschkin, S., Wäldchen, S., Binder, A., Montavon, G., Samek, W., Müller, K.R.: Unmasking Clever Hans predictors and assessing what machines really learn. Nat. Commun. 10(1), 1096 (2019)
Lee, J., Kim, E., Lee, S., Lee, J., Yoon, S.: FickleNet: weakly and semi-supervised semantic image segmentation using stochastic inference. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019)
Li, K., Wu, Z., Peng, K.C., Ernst, J., Fu, Y.: Tell me where to look: guided attention inference network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9215–9223 (2018)
Liu, F., Avci, B.: Incorporating priors with feature attribution on text classification. arXiv preprint arXiv:1906.08286 (2019)
Liu, S., Zhi, S., Johns, E., Davison, A.: Bootstrapping semantic segmentation with regional contrast. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=6u6N8WWwYSM
Luo, W., Yang, M.: Semi-supervised semantic segmentation via strong-weak dual-branch network. In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part V, pp. 784–800. Springer, Cham (2020)
Montavon, G., Binder, A., Lapuschkin, S., Samek, W., Müller, K.-R.: Layer-wise relevance propagation: an overview. In: Samek, W., Montavon, G., Vedaldi, A., Hansen, L.K., Müller, K.-R. (eds.) Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. LNCS (LNAI), vol. 11700, pp. 193–209. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-28954-6_10
Nam, W.J., Gur, S., Choi, J., Wolf, L., Lee, S.W.: Relative attributing propagation: interpreting the comparative contributions of individual units in deep neural networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 2501–2508 (2020)
Ouali, Y., Hudelot, C., Tami, M.: Semi-supervised semantic segmentation with cross-consistency training. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12674–12684 (2020)
Pan, J., et al.: Learning self-supervised low-rank network for single-stage weakly and semi-supervised semantic segmentation. Int. J. Comput. Vis. 130(5), 1181–1195 (2022)
Rao, S., Böhle, M., Parchami-Araghi, A., Schiele, B.: Using explanations to guide models. arXiv preprint arXiv:2303.11932 (2023)
Rieger, L., Singh, C., Murdoch, W., Yu, B.: Interpretations are useful: penalizing explanations to align neural networks with prior knowledge. In: International Conference on Machine Learning, pp. 8116–8126. PMLR (2020)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Ross, A.S., Hughes, M.C., Doshi-Velez, F.: Right for the right reasons: training differentiable models by constraining their explanations. arXiv preprint arXiv:1703.03717 (2017)
Schramowski, P., et al.: Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nat. Mach. Intell. 2(8), 476–486 (2020)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
Shakya, S., Vasquez, M., Wang, Y., Tchoua, R., Furst, J., Raicu, D.: Human-in-the-loop deep learning retinal image classification with customized loss function. In: Medical Imaging 2022: Computer-Aided Diagnosis, vol. 12033, pp. 512–519. SPIE (2022)
Shao, X., Skryagin, A., Stammer, W., Schramowski, P., Kersting, K.: Right for better reasons: training differentiable models by constraining their influence functions. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 9533–9540 (2021)
Sohn, K., et al.: FixMatch: simplifying semi-supervised learning with consistency and confidence. Adv. Neural Inf. Process. Syst. 33, 596–608 (2020)
Teney, D., Abbasnejad, E., Lucey, S., van den Hengel, A.: Evading the simplicity bias: training a diverse set of models discovers solutions with superior OOD generalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16761–16772 (2022)
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A., Bottou, L.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11(12) (2010)
Weber, L., Lapuschkin, S., Binder, A., Samek, W.: Beyond explaining: opportunities and challenges of XAI-based model improvement. Inf. Fusion 92, 154–176 (2023)
Wei, Y., Xiao, H., Shi, H., Jie, Z., Feng, J., Huang, T.S.: Revisiting dilated convolution: a simple approach for weakly- and semi-supervised semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7268–7277 (2018)
Acknowledgments
Funding: X.Y.: Helmholtz Einstein International Berlin Research School in Data Science (HEIBRiDS); J.F.: German Research Foundation RTG 2424; W.S. and D.K.: German Research Foundation grant DFG KI-FOR 5363, project no. 459422098.
Ethics declarations
Disclosure of Interest
The authors declare that they have no conflicts of interest related to this work.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, X., Franzen, J., Samek, W., Höhne, M.M.-C., Kainmueller, D. (2024). Model Guidance via Explanations Turns Image Classifiers into Segmentation Models. In: Longo, L., Lapuschkin, S., Seifert, C. (eds.) Explainable Artificial Intelligence. xAI 2024. Communications in Computer and Information Science, vol 2154. Springer, Cham. https://doi.org/10.1007/978-3-031-63797-1_7
DOI: https://doi.org/10.1007/978-3-031-63797-1_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63796-4
Online ISBN: 978-3-031-63797-1