Deep learning methods have become an omnipresent and highly successful part of recent approaches in imaging and vision. However, in most cases they are used on a purely empirical basis without real understanding of their behavior. From a scientific viewpoint, this is unsatisfying.

Many mathematically inclined researchers have a strong desire to understand the theoretical reasons for the success of these approaches and to find relations between deep learning and mathematically well-established techniques in imaging science. The goal of this special issue is to showcase their latest research results and to promote future research in this direction. It features twelve articles. To avoid any conflicts of interest, articles co-authored by one of the guest editors were handled by another guest editor.

We start with papers that provide mathematical insights into prototypical and widely used neural network architectures. The article “The Global Optimization Geometry of Shallow Linear Neural Networks” by Zhu et al. shows that shallow linear neural networks enjoy benign geometric properties, which explains, for instance, why popular gradient descent algorithms for training can converge globally.
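To fix ideas, here is a minimal sketch in our own notation (the paper treats a more general setting): for a two-layer linear network with weight matrices $W_1, W_2$, training data $X$ and targets $Y$, the training problem reads
\[
\min_{W_1, W_2} \; \tfrac{1}{2}\, \| W_2 W_1 X - Y \|_F^2 .
\]
Although this objective is non-convex in $(W_1, W_2)$, a benign geometry means, roughly, that every local minimizer is a global minimizer and all other critical points are strict saddles, so that gradient-based methods are not trapped at spurious solutions.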

In their paper on “Processing Simple Geometric Attributes with Autoencoders”, Newson et al. analyze how autoencoders, which constitute the simplest generative networks, encode and decode size and position. To this end, they consider centered discs with variable radii and Dirac delta functions.

The stability of neural networks under adversarial attacks is an important problem. It is addressed in the article “Adversarial Noise Attacks of Deep Learning Architectures—Stability Analysis via Sparse Modeled Signals” by Romano et al. The authors derive stability theorems for a state-of-the-art classification method by assuming that the signal has a (possibly multi-layer) sparse representation.
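As a rough illustration of such a model (in our simplified notation), a signal $x$ admits a multi-layer sparse representation with dictionaries $D_1, \dots, D_L$ if
\[
x = D_1 \gamma_1, \qquad \gamma_1 = D_2 \gamma_2, \qquad \dots, \qquad \gamma_{L-1} = D_L \gamma_L, \qquad \|\gamma_i\|_0 \le s_i ,
\]
and the stability bounds quantify how much the sparse codes $\gamma_i$, and hence the classification result, can change under a bounded perturbation of $x$.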

The popularity of the ResNet architecture is reflected by the fact that it is analyzed in multiple articles in our special issue. In “Forward Stability of ResNet and Its Variants,” Zhang and Schaeffer relate the post-activation ResNet to an optimal control problem with differential inclusions, and they derive continuous-time stability results for the corresponding differential inclusion. These results enable them to propose ResNet variants with improved stability bounds.

Ruthotto and Haber contribute a paper with the title “Deep Neural Networks Motivated by Partial Differential Equations,” where they connect ResNet architectures to parabolic and hyperbolic differential equations. This allows them to transfer the well-established theory from partial differential equations to neural networks, which also leads to several new architectures.
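The common thread of these two papers can be summarized in one line (a schematic view in our notation, not taken verbatim from either paper): a residual block
\[
x_{k+1} = x_k + h\, f(x_k, \theta_k), \qquad k = 0, \dots, K-1,
\]
is one forward Euler step of the ordinary differential equation $\dot{x}(t) = f(x(t), \theta(t))$, so that stability and well-posedness of the continuous problem translate into design principles for the discrete network.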

Rousseau et al. present an article on “Residual Networks as Flows of Diffeomorphisms.” They show that ResNets with shared weights can be seen as numerical approximations of exponential diffeomorphic operators.
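In symbols (again a schematic sketch in our notation): composing $N$ identical residual steps with step size $1/N$ gives
\[
\bigl(\operatorname{Id} + \tfrac{1}{N}\, v\bigr)^{\circ N}(x) \;\longrightarrow\; \varphi_1^{v}(x) = \exp(v)(x) \qquad (N \to \infty),
\]
where $\varphi_1^{v}$ denotes the time-one flow of the velocity field $v$; under suitable regularity assumptions on $v$, this flow is a diffeomorphism.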

The usefulness of advanced mathematics for improving deep learning approaches is demonstrated in the next two articles. In their paper “On Orthogonal Projections for Dimension Reduction and Applications in Augmented Target Loss Functions for Learning Problems,” Breger et al. advocate orthogonal projections of high-dimensional input and target data in learning frameworks. They introduce a general framework of variational loss functions for learning tasks that integrate additional information via transformations and projections of the target data, and they show that this concept can increase the accuracy of clinical image segmentation and music information classification.
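Schematically, and in our own simplified notation, such an augmented target loss takes the form
\[
L(\hat{y}, y) \;=\; \lambda_0\, d_0(\hat{y}, y) \;+\; \sum_{i=1}^{m} \lambda_i\, d_i\bigl(T_i \hat{y},\, T_i y\bigr),
\]
where the $T_i$ are task-specific transformations or orthogonal projections of the target data, the $d_i$ are distance measures, and the weights $\lambda_i$ balance the additional information against the standard data fidelity term.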

The paper of Effland et al. focuses on variational networks, a particular type of recurrent neural network. Using an optimal control approach, they analyze the well-known observation that gradient flows can yield better results if they are stopped before convergence. This paradoxical situation also arises for highly expressive regularizers that are learned from data. They derive first- and second-order conditions for optimal stopping times and propose variational networks that achieve competitive results for image denoising and deblurring.
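In its simplest form (a sketch in our notation, abstracting from the variational network setting), the question is when to stop the gradient flow
\[
\dot{x}(t) = -\nabla_x E\bigl(x(t)\bigr), \qquad x(0) = x_0,
\]
that is, how to choose a stopping time $T^\ast$ that minimizes a quality measure $\ell\bigl(x(T)\bigr)$ with respect to ground-truth data instead of running the flow to convergence; the first- and second-order conditions characterize such optimal stopping times.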

The remaining articles in this special issue connect learning and inverse problems. In “A Convex Variational Model for Learning Convolutional Image Atoms from Incomplete Data,” Chambolle et al. present convex and semi-convex frameworks for simultaneous atom learning and image reconstruction from incomplete, noisy and blurry data. Well-posedness and stability results are established in a continuous setting.
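A schematic, non-convex prototype of the underlying task (our notation; the paper develops convex and semi-convex alternatives to such bilinear formulations) is
\[
\min_{\{c_k\}, \{v_k\}} \; \tfrac{1}{2}\, \Bigl\| A\Bigl(\sum_k c_k \ast v_k\Bigr) - f \Bigr\|^2 \;+\; \lambda \sum_k \| v_k \|_1 ,
\]
where the $c_k$ are the convolutional atoms to be learned, the $v_k$ are sparse coefficient images, and $A$ models the incomplete, noisy or blurry measurement process.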

The paper of Schwab et al. with the title “Big in Japan: Regularizing Networks for Solving Inverse Problems” explores combinations of classical regularization with a correction term that is trained with deep learning. They prove that the resulting class of methods yields convergent approaches for solving inverse problems, derive convergence rates, and demonstrate the advantages of these methods experimentally.
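Schematically (our notation, following the description above), the reconstruction takes the form
\[
x_\alpha \;=\; B_\alpha(y) \;+\; N_{\theta}\bigl(B_\alpha(y)\bigr),
\]
where $B_\alpha$ is a classical regularization method for the inverse problem $Ax = y$ and $N_\theta$ is a trained network that corrects the classical reconstruction; the convergence analysis gives conditions under which $x_\alpha$ converges to a solution as the noise level and the regularization parameter $\alpha$ tend to zero.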

Dittmer et al. present an article that interprets the recently introduced deep image prior (DIP) in terms of the optimization of Tikhonov functionals. They obtain analytic results for specific network designs and linear operators.
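In schematic form (our notation), the deep image prior reconstructs a solution of the inverse problem $Ax = y$ by solving
\[
\min_{\theta} \; \tfrac{1}{2}\, \| A\, \varphi_\theta(z) - y \|^2, \qquad \hat{x} = \varphi_{\hat{\theta}}(z),
\]
where $\varphi_\theta$ is an untrained generator network with fixed random input $z$; the analysis relates this parametrized problem to classical Tikhonov functionals of the form $\tfrac{1}{2}\|Ax - y\|^2 + \alpha R(x)$ for specific network designs and linear operators $A$.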

The last paper in our special issue has the title “Networks for Nonlinear Diffusion Problems in Imaging.” Its authors, Arridge and Hauptmann, explore a diffusion-inspired network architecture which they term DiffNet. They show that DiffNet can be competitive with the popular U-Net while requiring substantially fewer parameters and less training data.
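One way to picture such an architecture (a schematic sketch in our notation, not necessarily the exact DiffNet parametrization) is to unroll explicit steps of nonlinear diffusion,
\[
u_{k+1} \;=\; u_k \;+\; \delta t\, \operatorname{div}\!\bigl( g_{\theta_k}\bigl(|\nabla u_k|\bigr)\, \nabla u_k \bigr),
\]
and to learn the diffusivities $g_{\theta_k}$ from data, so that each network layer corresponds to one time step of the diffusion process.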

We wish the readers an exciting journey through this fascinating and rapidly evolving area of applied mathematics.