Abstract
Semantic segmentation is one of the most popular tasks in computer vision. Its applications span from medical image analysis to self driving cars and beyond. For a given image, in semantic segmentation, we generate masks of image segments corresponding to each type or class. However, these segmented maps may either segment a region properly but assign it to a different class or they may have a possibly poor segmented region identification. Hence there is a need to visualize the regions of importance in an image for a given class. Class Activation Maps (CAMs) are popularly used in the classification task to understand the correlation of a class and the regions in an image that correspond to it. We propose a new framework to model the semantic segmentation task as an end to end classification task. This can be used with any deep learning based segmentation network. Using this, we visualize the gradient based CAMs (GradCAM) for the task of semantic segmentation. We also validate our results by using sanity checks for saliency maps and correlate them to those found for the classification task.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: Slic superpixels. Technical report (2010)
Adebayo, J., Gilmer, J., Muelly, M., Goodfellow, I., Hardt, M., Kim, B.: Sanity checks for saliency maps. In: Advances in Neural Information Processing Systems, pp. 9505–9515 (2018)
Ali, S., et al.: Endoscopy artifact detection (EAD 2019) challenge dataset. arXiv preprint arXiv:1905.03209 (2019)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2017)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Kim, D.: Guided prop (2019). https://donghwa-kim.github.io/guidedprop.html
Kotikalapudi, R.: Keras vis (2019). https://github.com/raghakot/keras-vis
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
Shi, J., Malik, J.: Normalized cuts and image segmentation. Departmental Papers (CIS), p. 107 (2000)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Sun, K., et al.: High-resolution representations for labeling pixels and regions. arXiv preprint arXiv:1904.04514 (2019)
Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Zhang, H., et al.: ResNeSt: split-attention networks. arXiv preprint arXiv:2004.08955 (2020)
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2929 (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bedmutha, M.S., Raman, S. (2021). Using Class Activations to Investigate Semantic Segmentation. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1378. Springer, Singapore. https://doi.org/10.1007/978-981-16-1103-2_14
Download citation
DOI: https://doi.org/10.1007/978-981-16-1103-2_14
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1102-5
Online ISBN: 978-981-16-1103-2
eBook Packages: Computer ScienceComputer Science (R0)