Abstract
Self-supervised models provide results on par with or superior to those of their fully supervised counterparts, yet it is unclear what information about images they contain. As a result, a visual probing framework was recently introduced to probe image representations for interesting visual features. While visual probing provides information about semantic knowledge, complexity, and consistency, it does not directly and exhaustively explain which visual features push self-supervised image representations away and which are neutral. In this paper, we fill this gap by proposing a method that removes a particular visual feature from the image and analyzes how such a distortion influences the representation. Our key findings emphasize that discrepancies in features such as lines and forms push self-supervised representations away more than changes in brightness, color, shape, and especially texture. Our work is complementary to visual probing and provides a more direct explanation of the mechanisms behind the contrastive loss.
Supported by grant no. POIR.04.04.00-00-14DE/18-00, carried out within the Team-Net program of the Foundation for Polish Science and co-financed by the European Union under the European Regional Development Fund, and by the Priority Research Area Digiworld under the program Excellence Initiative – Research University at the Jagiellonian University in Kraków.
Notes
1. We use the following implementations of the self-supervised methods: https://github.com/{google-research/simclr, yaox12/BYOL-PyTorch, facebookresearch/swav, facebookresearch/moco}. We use the ResNet-50 (1x) variant for each self-supervised method.
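For illustration only, below is a minimal sketch of the kind of analysis the abstract describes: remove a single visual feature from an image and measure how far the embedding of the distorted image drifts from that of the original. The torchvision-pretrained ResNet-50 stand-in, the grayscale and blur distortions (removing color and texture, respectively), and the cosine distance are assumptions made for this sketch; the paper's experiments rely on the self-supervised checkpoints from the repositories listed above.

import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

# Stand-in backbone (assumption): a torchvision ResNet-50 with the classification
# head replaced by Identity, so it returns 2048-d embeddings like the encoders above.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Illustrative feature removals: grayscaling removes color, heavy blurring removes texture.
remove_color = transforms.Grayscale(num_output_channels=3)
remove_texture = transforms.GaussianBlur(kernel_size=21, sigma=5.0)

def embedding_drift(image: Image.Image, distortion) -> float:
    """Cosine distance between embeddings of the original and the distorted image."""
    with torch.no_grad():
        z_orig = backbone(preprocess(image).unsqueeze(0))
        z_dist = backbone(preprocess(distortion(image)).unsqueeze(0))
    return 1.0 - F.cosine_similarity(z_orig, z_dist).item()

img = Image.open("example.jpg").convert("RGB")  # hypothetical input image
print("color removed:  ", embedding_drift(img, remove_color))
print("texture removed:", embedding_drift(img, remove_texture))

Comparing such drift values across many images and distortions is one way to rank which feature removals push representations away most; the exact distortions and distance used in the paper may differ.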
About this paper
Cite this paper
Zieliński, B., Górszczak, M. (2021). What Pushes Self-supervised Image Representations Away?. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5