Making Sense of CNNs: Interpreting Deep Representations and Their Invariances with INNs

Rombach, Robin; Esser, Patrick; Ommer, Björn

doi:10.1007/978-3-030-58520-4_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12362)

Included in the following conference series:

European Conference on Computer Vision

Abstract

To tackle increasingly complex tasks, it has become an essential ability of neural networks to learn abstract representations. These task-specific representations and, particularly, the invariances they capture turn neural networks into black box models that lack interpretability. To open such a black box, it is therefore crucial to uncover the different semantic concepts a model has learned as well as those that it has learned to be invariant to. We present an approach based on invertible neural networks (INNs) that (i) recovers the task-specific, learned invariances by disentangling the remaining factor of variation in the data and that (ii) invertibly transforms these recovered invariances combined with the model representation into an equally expressive one with accessible semantic concepts. As a consequence, neural network representations become understandable by providing the means to (i) expose their semantic meaning, (ii) semantically modify a representation, and (iii) visualize individual learned semantic concepts and invariances. Our invertible approach significantly extends the abilities to understand black box models by enabling post-hoc interpretations of state-of-the-art networks without compromising their performance.
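
The abstract's claim that an invertible transformation yields an "equally expressive" representation can be illustrated with a single affine coupling layer, a common INN building block. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `make_weights`, `coupling_forward`, and `coupling_inverse`, the toy linear conditioner, and the random vectors standing in for a model representation `z` and recovered invariances `v` are all hypothetical.

```python
import math
import random


def make_weights(d_in, d_out, seed=0):
    """Hypothetical fixed random weights standing in for a trained sub-network."""
    rng = random.Random(seed)
    return [
        ([rng.gauss(0, 1) for _ in range(d_in)],
         [rng.gauss(0, 1) for _ in range(d_in)])
        for _ in range(d_out)
    ]


def _conditioner(x, weights):
    """Map the untouched half x to one (log_scale, shift) pair per output."""
    params = []
    for w_scale, w_shift in weights:
        h = sum(w * xi for w, xi in zip(w_scale, x))
        g = sum(w * xi for w, xi in zip(w_shift, x))
        params.append((math.tanh(h), g))  # tanh keeps the log-scale bounded
    return params


def coupling_forward(x, weights):
    """Keep the first half; scale and shift the second half conditioned on it."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    params = _conditioner(x1, weights)
    y2 = [xi * math.exp(s) + t for xi, (s, t) in zip(x2, params)]
    return x1 + y2


def coupling_inverse(y, weights):
    """Exact inverse: the first half is unchanged, so the parameters can be recomputed."""
    d = len(y) // 2
    y1, y2 = y[:d], y[d:]
    params = _conditioner(y1, weights)
    x2 = [(yi - t) * math.exp(-s) for yi, (s, t) in zip(y2, params)]
    return y1 + x2


rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(3)]  # stand-in for a model representation
v = [rng.gauss(0, 1) for _ in range(3)]  # stand-in for recovered invariances
weights = make_weights(3, 3, seed=1)

e = coupling_forward(z + v, weights)     # transformed joint representation
recovered = coupling_inverse(e, weights)
max_err = max(abs(a - b) for a, b in zip(z + v, recovered))
```

Because the inverse reproduces the input up to floating-point error, no information about `z` or `v` is lost by the transformation. In practice several such layers are stacked, with the halves permuted between layers so that every dimension gets transformed.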

R. Rombach and P. Esser contributed equally to this work.

Notes

1. We used weights available at https://github.com/rgeirhos/texture-vs-shape.

Acknowledgments

This work has been supported in part by the German Research Foundation (DFG) projects 371923335, 421703927, and EXC 2181/1 - 390900948 and the German federal ministry BMWi within the project “KI Absicherung”.

Author information

Authors and Affiliations

Interdisciplinary Center for Scientific Computing, HCI, Heidelberg University, Heidelberg, Germany
Robin Rombach, Patrick Esser & Björn Ommer

Corresponding author

Correspondence to Robin Rombach.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Rombach, R., Esser, P., Ommer, B. (2020). Making Sense of CNNs: Interpreting Deep Representations and Their Invariances with INNs. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12362. Springer, Cham. https://doi.org/10.1007/978-3-030-58520-4_38

DOI: https://doi.org/10.1007/978-3-030-58520-4_38
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58519-8
Online ISBN: 978-3-030-58520-4
eBook Packages: Computer Science, Computer Science (R0)
