
1 Is Interpretability Necessary?

Contemporary computer vision algorithms – in the context of art and beyond – make extensive use of artificial neural networks to solve object recognition and classification tasks. The most common architecture employed for such tasks is the deep convolutional neural network (CNN) [7, 9, 10]. With the spread of CNNs across domains, however, a problem particular to deep neural networks has resurfaced: while we can train such networks to perform very well on specific tasks, it is often impossible to know how a model arrives at a decision, i.e. which features of an input image are relevant for its classification. As a response to this impasse, interpretable machine learning has grown into its own distinct area of research, with visual analytics of CNNs as an emerging field of study [5]. While much of the research in this area is concerned with the development of an empirical approach to interpretability [6, 15], one of its open qualitative questions is: which machine learning models need to be interpretable?

While it is obvious that machine learning models deployed in high-stakes scenarios, such as credit ratings and recidivism prediction (or predictive policing in general), deserve increased scrutiny and necessitate interpretability [12, 21], it has been questioned [11] whether models deployed in less critical contexts require interpretability at all, or whether the internal “reasoning” of such models is irrelevant given a sufficiently low error rate on the actual task. The main hypothesis of this paper is that, in computer vision for art, interpretability is desirable, if not indispensable, despite the absence of a need for normative assessment.

2 Representation and Interpretation

One of the most common technical approaches to increasing the interpretability of CNNs is feature visualization. Feature visualization has been an important research area within machine learning in general, and deep learning in particular, at least since 2014 [24, 27]. All feature visualization methods rely on the principle of activation maximization: the learned features of a particular neuron or layer are visualized by optimizing an input image, initialized with random noise, to maximally activate that neuron or layer.
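To make the principle concrete, the following minimal sketch (in PyTorch, not part of the original paper) performs gradient ascent on a random noise image so that it maximally activates a single output neuron of an ImageNet-pretrained InceptionV3; the target class index, learning rate, and number of steps are arbitrary placeholders.

```python
import torch
from torchvision import models

# Minimal activation maximization sketch (illustrative, not the paper's code):
# optimize a random noise image so that it maximally activates one output neuron.
model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
model.eval()

TARGET_CLASS = 130  # placeholder: index of the output neuron to maximize
image = torch.randn(1, 3, 299, 299, requires_grad=True)  # random noise input
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(256):
    optimizer.zero_grad()
    logits = model(image)
    loss = -logits[0, TARGET_CLASS]  # maximize the activation = minimize its negative
    loss.backward()
    optimizer.step()

visualization = image.detach()  # unregularized "feature visualization" image
```

Without further constraints, the output of such a loop tends to look adversarial rather than legible, which is why regularization becomes decisive, as discussed next.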

For instance, an image optimized for an output neuron of a neural network trained on the ImageNet dataset will intuitively show some object from the class associated with this neuron – if it is subjected to proper regularization [19, 20, 26]. More elaborate methods employ natural image priors to “bias” visualizations even more towards “legible” images [2, 16, 17, 18]. In fact, unregularized feature visualization images will often fall into the range of adversarial examples [4] for a given class, i.e. they will not be visually related to natural images from this class but still activate the output neuron for this class with very high confidence. Moreover, as [19] and many others have observed, many feature visualization images are “strange mixtures of ideas” that seem to blend features from many different natural images. This suggests that individual neurons are not necessarily the right semantic units for understanding neural nets. Indeed, as [25] show, looking for meaningful features does not necessarily lead to more meaningful visualizations than looking for any combination of features, i.e. producing arbitrary activation maximizations. While some recent results [1] seem to weaken the assumption of a distributed representational structure of CNNs, the assumption has nevertheless given rise to a number of highly visible critical interventions suggesting that it will be necessary to augment deep learning methods with more symbolic approaches [8, 13, 22].
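One simple regularizer of the kind used for the figure below, a median filter applied periodically during optimization, could be sketched as follows. The scipy-based implementation is an assumption; only the \(5\times 5\) filter size and the four-iteration interval follow the figure caption.

```python
import torch
from scipy.ndimage import median_filter

def median_regularize(image: torch.Tensor, size: int = 5) -> torch.Tensor:
    """Apply a per-channel spatial median filter to a (1, C, H, W) image tensor."""
    filtered = median_filter(image.detach().numpy(), size=(1, 1, size, size))
    return torch.from_numpy(filtered)

# Inside the optimization loop sketched above, e.g. every fourth iteration:
# if step % 4 == 0:
#     with torch.no_grad():
#         image.copy_(median_regularize(image, size=5))
```

Smoothing out high-frequency patterns in this way biases the optimization away from the adversarial-example regime described above.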

From this indispensability of regularization we can construct a technical argument about the notion of representation as it applies to feature visualization. Johanna Drucker has described the act of interpretation as the collapse of the probability distribution of all possible interpretations of an aesthetic artifact [3]. For feature visualization images, this metaphor applies literally: they are samples from the probability distribution approximated by the model as a whole. Somewhat counter-intuitively, feature visualization images, despite being technical images, are thus arbitrary interpretations in the exact sense suggested by Drucker, and interpretations based on them become (human) interpretations of (technical) interpretations. One possible conclusion to draw from this peculiar representational character of feature visualization images would be that visual interpretability as a concept is critically flawed. We propose to draw the opposite conclusion: it is exactly this “subjective” nature of feature visualization images that makes visual interpretability useful for computer vision for art.

3 A Non-traditional Approach to Visual Interpretability

Our suggestion is to use feature visualization images to “augment” the original dataset under investigation. Concretely, this would mean that, in assessing a dataset with the help of machine learning, the digital art historian would not only take the model’s results into account but also include a large set of feature visualization images in the analysis, as sketched below. In this “non-traditional” approach, the digital art historian’s hermeneutic work would extend back into the very technical system that enables it, operating on both the original dataset and the feature visualization dataset. The technical system, rather than being an opaque tool, would become an integral part of the interpretative process.
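A minimal sketch of this workflow, assuming a `visualize` function like the activation maximization loop above, the fine-tuned classifier described in Fig. 1 below, and a hypothetical output directory, would generate several visualizations per class and store them next to the original dataset for inspection:

```python
import torch
from pathlib import Path
from torchvision.utils import save_image

# Hypothetical class names of the fine-tuned classifier and an output folder.
CLASS_NAMES = ["portrait", "landscape", "still_life"]
OUT_DIR = Path("feature_visualizations")
OUT_DIR.mkdir(exist_ok=True)

# Several visualizations per class (different random seeds) form the
# "feature visualization dataset" to be read alongside the original images.
for class_index, class_name in enumerate(CLASS_NAMES):
    for seed in range(8):
        torch.manual_seed(seed)
        image = visualize(model, class_index)  # e.g. the loop sketched above
        # normalize=True rescales the unbounded optimization result to [0, 1]
        save_image(image, OUT_DIR / f"{class_name}_{seed:02d}.png", normalize=True)
```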

Fig. 1. Feature visualization images for the “portrait” and “landscape” classes of an InceptionV3 neural network. The network was trained on ImageNet and then fine-tuned for ten epochs on an art historical dataset. The dataset, a subset of the Web Gallery of Art dataset, consists of three classes (portrait, landscape, and still life) with 1400 images per class. The resulting classifier reaches 95% validation accuracy. Only minimal regularization was used in the production of the feature visualization images (a \(5\times 5\) median filter was applied every four iterations). High resolution was achieved through multi-scale optimization as proposed in the original implementation of the “deep dream” algorithm [14]. The color channels of the final image were normalized independently.
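A sketch of the fine-tuning setup described in the caption could look as follows; the dataset path, transforms, batch size, and optimizer are assumptions, while the InceptionV3 backbone, the three classes, and the ten epochs follow the caption. The multi-scale, deep-dream-style optimization of the visualizations themselves is omitted here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Sketch of the fine-tuning setup from the caption; paths and hyperparameters
# not stated there are illustrative assumptions.
transform = transforms.Compose([
    transforms.Resize((299, 299)),  # InceptionV3 input size
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
dataset = datasets.ImageFolder("wga_subset/train", transform=transform)  # hypothetical path
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# ImageNet-pretrained InceptionV3 with a new three-class head
# (portrait, landscape, still life); the auxiliary head is dropped for simplicity.
model = models.inception_v3(
    weights=models.Inception_V3_Weights.IMAGENET1K_V1, aux_logits=False
)
model.fc = nn.Linear(model.fc.in_features, 3)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(10):  # ten epochs, as stated in the caption
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```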

The toy example in Fig. 1 shows the feasibility of this approach: the model seems to have learned that faces and, surprisingly, drapery are the defining features of a portrait. The highest-scoring image from the training dataset, Moretto da Brescia’s Christ with an Angel (1550), confirms this hypothesis, as it contains two faces and three prominent pieces of drapery. A defining feature of a landscape painting, according to the model, seems to be the blue shift of aerial perspective. Both results point to a subtle (likely historical and/or geographical) bias in the dataset that deserves further analysis. Importantly, however, it is the strangeness, the ambiguity, the “Verfremdungseffekt” of the feature visualization image – open to the same kind of interpretation as the original image – that facilitates this conclusion.
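Retrieving the highest-scoring training image for a class, the step that surfaced the Moretto da Brescia painting here, could be sketched as follows, assuming the fine-tuned `model` and `dataset` from the sketch above and a hypothetical class index:

```python
import torch

PORTRAIT = 0  # hypothetical index of the "portrait" class in the fine-tuned model
model.eval()

# Score every training image for the "portrait" class and keep the maximum.
best_score, best_index = float("-inf"), None
with torch.no_grad():
    for i, (image, _) in enumerate(dataset):
        score = model(image.unsqueeze(0))[0, PORTRAIT].item()
        if score > best_score:
            best_score, best_index = score, i

best_path, _ = dataset.samples[best_index]  # ImageFolder stores (path, label) pairs
print(best_path, best_score)
```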

[23] have suggested understanding interpretability as a set of strategies to counteract both the inscrutability and the anti-intuitiveness of machine learning models. Inscrutability is defined as the difficulty of investigating a model with a high number of parameters and high structural complexity. Anti-intuitiveness, on the other hand, refers to the fact that the internal “reasoning” of a model does not necessarily correspond to intuitive methods of inference, as hidden correlations often play an essential role. Taking up this distinction, we could say that the specific non-traditional interpretability strategy described above would not try to eliminate the anti-intuitiveness of a machine learning model but rather turn it on its feet, embracing its anti-intuitive nature and exploiting it for the benefit of interpretation.

4 Conclusion

We have shown that the representational status of feature visualization images is not as straightforward as often assumed. Based on this clarification, we have proposed that visual interpretability, understood as a method to render the anti-intuitive properties of machine learning models usable, rather than trying to eliminate them, could benefit computer vision for art by extending the reach of the digital art historian’s analysis to include the machines used to facilitate this analysis. Our toy example demonstrates the feasibility of this approach.

Both digital art history and interpretable machine learning are academic fields that emerged only in the past twenty to thirty years and experienced significant growth only in the past five. The intimate connection between the two fields, rooted in their common interest in the analysis and interpretation of images, makes closer collaboration between researchers from both fields reasonable and desirable. The non-traditional interpretability strategy outlined above is only one of many possible non-traditional approaches that could significantly impact both fields, technically as well as conceptually.