
1 Introduction

The neuroimaging field is moving toward the micron scale, accelerated by modern imaging methods [14, 31], cell labeling techniques [36], and a desire to understand the brain at the level of circuits [18]. This shift is impacting basic neuroscience research involving animal model organisms, as well as human digital pathology for understanding neurological disease. While some 3D modalities based on tissue clearing, such as CLARITY or iDISCO, are becoming available, two-dimensional (2D) histological sections stained for relevant features and imaged with light microscopy remain a gold standard for identifying anatomical regions [10] and making diagnoses in many neurodegenerative diseases [16]. Interpreting these data requires mapping them to the common coordinates of a well characterized 3D atlas. This allows automatic labeling of anatomical regions for parsing experimental results, as well as the ability to combine data from different experiments or laboratories, facilitating a statistical understanding of neuroimaging.

The Brain Initiative Cell Census Network (https://www.braininitiative.nih.gov/brain-programs/cell-census-network-biccn) was created with the goal of mapping molecular, anatomical, and functional data into comprehensive brain cell atlases. This project is beginning with mouse imaging data in the Allen Common Coordinate Framework (CCF) [20], and building toward nonhuman primates and ultimately humans. To understand cell diversity via gene expression throughout the brain, one important contribution is to perform single nucleus RNA sequencing (snRNA-seq) [21] at key locations throughout the mouse brain. The data associated with this analysis include coronal images of 20 \(\upmu \)m thick serially sectioned Nissl stained tissue, allowing visualization of the location and density of neuron cell bodies with a blue/violet color. At designated locations, tissue is cut 100 \(\upmu \)m thick, and 0.5–2 mm diameter circular regions are punched out for RNA sequencing after slicing but before staining and imaging. These heavily processed sections present variable contrast profiles, missing tissue, and artifacts, as seen in Fig. 1, posing serious challenges for existing registration techniques.

Fig. 1. Several Nissl stained coronal sections of the mouse brain are shown from anterior (top left) to posterior (bottom right).

The goal of this work is to develop an image registration algorithm, using diffeomorphic techniques developed by the Computational Anatomy community, to accurately map the Allen Institute's CCF to our Nissl datasets. Because this community has largely studied mappings between pairs of images that are topologically equivalent, the fundamentally asymmetric nature of our 3D atlas as compared to sparsely sectioned 2D data requires a nonstandard approach.

Several approaches to registration with missing data or artifacts have been developed by the community. The simplest consists of manually defining binary masks that indicate data to be ignored by image similarity functions [9, 35] before registration. A slightly more involved method is inpainting, where data in these regions are replaced by a specific image intensity or texture [33], a method included in ANTs [6, 44] for registration in the presence of multiple sclerosis lesions. Anomalous data such as excised tissue, tumors, or other lesions [24,25,26, 45] have been jointly estimated together with registration parameters using models for contrast changes. Others have used statistical approaches [11, 27, 40] based on Expectation Maximization (EM) algorithms [12]; this is the approach we follow. In the presence of contrast differences this problem is more challenging: image similarity functions designed for cross-modality registration, such as normalized cross correlation [4, 5, 42], mutual information [22, 23, 30], or local structural information [7, 15, 41], cannot be used in an EM setting because they do not correspond to a data log-likelihood.

The importance of mapping histology into 3D coordinates has long been recognized. A recent review [28] lists 30 different software packages attempting the task. While the review acknowledges artifacts such as folds, tears, cracks, and holes as important challenges, they are not adequately addressed by these methods. Modern approaches to the problem (e.g. [1,2,3, 43]) tend to involve multiple preprocessing steps and stages of alignment, and carefully constructed metrics of image similarity. Instead, our approach follows statistical estimation within an intuitive generative model of the formation of 2D slice images from 3D atlases. Large 3D deformations of the atlas and 2D deformations of each slice are modeled via unknown diffeomorphisms using established techniques from the Computational Anatomy community, the contrast profile of each observed slice is estimated using an unknown polynomial transformation, and observed images are modeled with additive Gaussian white noise (conditioned on transformation parameters). Artifacts and missing tissue are accommodated through Gaussian mixture modeling (GMM) at each observed pixel. The final two parts, the polynomial transformation and the GMM, are the critical innovations necessary to handle contrast changes and missing tissue. Our group has described this basic approach in the context of single digital pathology images [39], and developed a simpler version for serial sections [19] (considering deformations only in 3D, without contrast changes or artifacts). Here we extend this approach to the serially sectioned mouse brain, enabling annotation of each pixel and mapping of slices into standard 3D coordinates.

2 Algorithm

We first describe the generative model at the heart of our mapping algorithm, and then discuss its optimization via EM and gradient descent.

2.1 Generative Model

The generative model, which predicts the shape and appearance of each 2D slice from our 3D atlas, is shown schematically in Fig. 2, with transformation parameters summarized in Table 1. In this model the roles of our 3D atlas and observed data are fundamentally asymmetric: slices can be generated from the atlas, but not vice versa. The ordering of this scheme is motivated by the imaging process, where 3D transformations describe shape differences between an observed brain and a canonical atlas, and 2D transformations describe distortions that occur in the sectioning and imaging process. The steps below describe transformations that are all estimated jointly using a single cost (a weighted sum of square error over all 2D slices), rather than by connecting standard algorithms one after another in a pipeline. The final two steps, the polynomial intensity transformation and the posterior-weighted sum of square error using a GMM, are novel in this work and critical for handling contrast variability and missing tissue. Below we describe each step in detail.

Fig. 2. A schematic of the transformations in our generative model. The top row shows 3D transformations that affect all slices, while the bottom row shows transformations specific to each slice. Posterior weights estimate artifact and missing tissue locations via EM.

Table 1. Listed below are variables used in our transformation, as well as the number of degrees of freedom to be estimated (on each slice if applicable).

I. Atlas Image: We use the Nissl Allen atlas [13], denoted I, at 50 \(\upmu \)m resolution with the 2017 anatomical labels, available publicly in nearly raw raster data (NRRD) format at http://download.alleninstitute.org/informatics-archive/current-release/mouse_ccf/. This is a grayscale image, which we align to red-green-blue (RGB) Nissl stained sections denoted \(J^i\) for \(i \in \{1,\cdots ,N\}\).
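
For readers working with these files, the atlas and label volumes can be loaded in a few lines. The sketch below uses the pynrrd package; the file names are illustrative of the downloads at the URL above, not prescribed by our pipeline.

```python
import numpy as np
import nrrd  # pip install pynrrd

# Illustrative file names; see the Allen Institute download directory above.
I, hdr_I = nrrd.read("ara_nissl_50.nrrd")        # grayscale Nissl atlas at 50 um
labels, hdr_L = nrrd.read("annotation_50.nrrd")  # 2017 anatomical label volume

# Normalize atlas intensities to [0, 1] for the intensity model described below.
I = I.astype(np.float64)
I = (I - I.min()) / (I.max() - I.min())
```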

II. 3D Diffeomorphism: Diffeomorphisms are generated using the large deformation diffeomorphic metric mapping framework [8] by integrating smooth velocity fields numerically using the method of characteristics [34]. We denote our 3D smooth velocity field by \(v_t\), and our 3D diffeomorphism by \(\varphi =\varphi _1\), where \(\dot{\varphi }_t = v_t(\varphi _t)\) and \(\varphi _0 = \) identity. Smoothness is enforced using a regularization penalty given by \(E_{\text {reg}} = \frac{1}{2\sigma ^2_R}\int _0^1 \int |Lv_t(x)|^2dx\, dt\) with \(L = (id - a^2\varDelta )^2\) for id the identity, \(\varDelta \) the Laplacian, \(a=400\) \(\upmu \)m a characteristic length scale, and \(\sigma _R=5\times 10^4\) a parameter controlling the tradeoff between regularization and accuracy. This time-integrated kinetic energy penalty is a standard approach to regularization in the computational anatomy community [8].
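
As a concrete illustration of the method of characteristics, the following minimal NumPy/SciPy sketch integrates a time-dependent velocity field into the inverse map \(\varphi _1^{-1}\) (the quantity needed to deform the atlas). It assumes velocities sampled on the voxel grid in voxel units and is a simplified stand-in, not our exact implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def integrate_flow(v):
    """Integrate dphi/dt = v_t(phi_t), phi_0 = identity, returning phi_1^{-1}
    sampled on the grid. v: (nt, 3, nx, ny, nz) velocity in voxel units."""
    nt = v.shape[0]
    dt = 1.0 / nt
    grid = np.stack(np.meshgrid(*[np.arange(n, dtype=np.float64)
                                  for n in v.shape[2:]], indexing="ij"))
    phi_inv = grid.copy()
    for t in range(nt):
        # Semi-Lagrangian step: phi_inv_{t+dt}(x) = phi_inv_t(x - dt * v_t(x))
        x_back = grid - dt * v[t]
        phi_inv = np.stack([map_coordinates(phi_inv[c], x_back,
                                            order=1, mode="nearest")
                            for c in range(3)])
    return phi_inv
```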

III. 3D Affine: We include linear changes in location and scale through a \(4\times 4\) affine matrix A with 12 degrees of freedom (9 linear and 3 translation).

IV. 3D to 2D Slicing: 2D images are generated by slicing the transformed 3D volume at N known locations \(z_i\). Each slice is separated by 20 \(\upmu \)m, with the exception of thick slices which are separated by 100 \(\upmu \)m.
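
Steps III and IV can be combined into a single resampling: the deformed atlas \(I\circ \varphi ^{-1}\circ A^{-1}\) is evaluated on the plane \(z=z_i\). A sketch under simplifying assumptions (all coordinates in voxel units; a real implementation must also convert between physical and voxel coordinates):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def render_slice(I, phi_inv, A, nx, ny, z_i):
    """Evaluate I(phi^{-1}(A^{-1} x)) on the plane z = z_i.
    phi_inv: (3,) + I.shape inverse diffeomorphism from integrate_flow;
    A: 4x4 affine; all coordinates here are in voxel units."""
    Ainv = np.linalg.inv(A)
    X, Y = np.meshgrid(np.arange(nx, dtype=np.float64),
                       np.arange(ny, dtype=np.float64), indexing="ij")
    P = np.stack([X, Y, np.full_like(X, z_i), np.ones_like(X)])  # homogeneous
    Q = np.tensordot(Ainv, P, axes=1)[:3]                        # apply A^{-1}
    # Look up the inverse diffeomorphism at A^{-1}x, then sample the atlas there.
    Qd = np.stack([map_coordinates(phi_inv[c], Q, order=1, mode="nearest")
                   for c in range(3)])
    return map_coordinates(I, Qd, order=1, mode="constant")
```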

V. 2D Diffeomorphism: For each slice i, a 2D diffeomorphism \(\psi ^i\) is generated from a smooth velocity field \(w_t^i\), using the same methods as in 3D, with regularization scale \(a_S=400\) \(\upmu \)m, weight \(\sigma _{R_S}=1\times 10^3\), and penalty \(E_{\text {reg}}^i\).

VI. 2D Rigid: For each slice a rigid transformation \(R^i\) is applied with 3 degrees of freedom (1 rotation and 2 translations). For identifiability, rotations and translations are constrained to be zero mean when averaged across slices. We constrain each transformation to be rigid by parameterizing its rotation as the exponential of an antisymmetric matrix.
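
A 2D rotation can be made exactly rigid by exponentiating an antisymmetric generator, as in the sketch below (using scipy.linalg.expm; the closed form is the usual cosine/sine matrix). The zero-mean identifiability constraint amounts to subtracting the across-slice means of the rotation and translation parameters.

```python
import numpy as np
from scipy.linalg import expm

def rigid_2d(theta, tx, ty):
    """3x3 homogeneous rigid transform; the rotation block is the matrix
    exponential of an antisymmetric generator, so R^T R = I by construction."""
    S = np.array([[0.0, -theta],
                  [theta, 0.0]])   # antisymmetric generator
    T = np.eye(3)
    T[:2, :2] = expm(S)            # equals [[cos, -sin], [sin, cos]]
    T[:2, 2] = [tx, ty]
    return T

# Identifiability: constrain parameters to be zero mean across slices, e.g.
# thetas -= thetas.mean(); txs -= txs.mean(); tys -= tys.mean()
```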

VII. Intensity Transform: On each slice a cubic polynomial intensity transformation is applied to predict the observed Nissl stained data in a minimum weighted least squares sense. Since the data consist of RGB images, this is 12 degrees of freedom (4 polynomial coefficients per channel), possibly nonmonotonic and flexible enough to permute the brightness order of background, gray matter, and white matter. We refer to the intensity transformed atlas corresponding to the ith slice as \(\hat{J}^i\) (the “hat” notation is used to convey that this is an estimate of the shape and appearance of the observed image \(J^i\)).
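
Because the model is linear in the polynomial coefficients, the weighted least squares fit has a closed form; a minimal sketch (function and variable names are ours, for illustration only):

```python
import numpy as np

def fit_intensity_transform(atlas_slice, J, W):
    """Cubic polynomial from grayscale atlas intensity to RGB, fit by weighted
    least squares: coeffs = (B^T W B)^{-1} B^T W J, 4 coefficients per channel.
    atlas_slice: (nx, ny); J: (nx, ny, 3) observed RGB; W: (nx, ny) weights."""
    s = atlas_slice.ravel()
    B = np.stack([np.ones_like(s), s, s**2, s**3], axis=1)   # (npix, 4) basis
    w = W.ravel()[:, None]
    Jf = J.reshape(-1, 3)
    coeffs = np.linalg.solve(B.T @ (w * B), B.T @ (w * Jf))  # (4, 3)
    Jhat = (B @ coeffs).reshape(J.shape)                     # predicted slice
    return coeffs, Jhat
```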

VIII. Weighted Sum of Square Error: Each transformed atlas slice is compared to our observed Nissl slice using the weighted sum of square error \(E^i_{\text {match}} = \frac{1}{2\sigma ^2_M}\int |\hat{J}^i(x,y) - J^i(x,y)|^2W_M^i(x,y)\, dx\, dy\), where the weight \(W_M^i\) is the posterior probability that each pixel corresponds to some location in the atlas (to be estimated via the EM algorithm), as opposed to being artifact or missing tissue. The constant \(\sigma _M\) represents the standard deviation of the Gaussian white noise in the image and is set to 0.05 (for an RGB image in the range [0, 1]).
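
Discretized over pixels, this matching term is a weighted residual sum; a short sketch under the same conventions as the snippets above:

```python
import numpy as np

def match_energy(Jhat, J, W_M, sigma_M=0.05, dA=1.0):
    """Posterior-weighted sum of square error E_match^i for one RGB slice;
    dA is the pixel area for the discretized integral."""
    r2 = np.sum((Jhat - J) ** 2, axis=-1)   # squared residual over channels
    return (W_M * r2).sum() * dA / (2.0 * sigma_M ** 2)
```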

2.2 Optimization via Expectation Maximization

M step: Given a weight at each pixel, the M step corresponds to estimating unknown transformation parameters by minimizing the cost with a fixed \(W_M^i\):

$$\begin{aligned} \text {Cost}_{W_M} = E_{\text {reg}} + \textstyle \sum _{i = 1}^N \left( E_{\text {reg}}^i + E^i_{\text {match}}\right) \end{aligned}$$

All unknown deformation parameters are computed iteratively by gradient descent, with gradients backpropagated from each step to the previous. Rather than posing this as a sequential pipeline, optimization over each parameter is performed jointly. This mitigates negative effects seen in pipeline-based approaches, such as errors propagated from a poor initial affine alignment, and is consistent with interpreting the procedure as joint maximum likelihood estimation.

Gradients for linear transformations are derived using standard calculus techniques. Gradients with respect to velocity fields were described originally in [8]. As shown in [38], gradients are backpropagated from the endpoint of a 3D diffeomorphic flow to time t via

$$\begin{aligned} ({v_t}\,\mathrm{gradient}) = \textstyle \frac{1}{\sigma ^2_R}v_t - K*\left[ (\text {endpoint gradient})\circ \varphi _{1t}^{-1}|D\varphi _{1t}^{-1}| \nabla (I\circ \varphi _{1t}^{-1})\right] \end{aligned}$$

where \(K*\) is convolution with the Green’s kernel of \(L^*L\), \(\nabla =\left( \frac{\partial }{\partial x},\frac{\partial }{\partial y},\frac{\partial }{\partial z}\right) ^T\), and \(\varphi _{1t} = \varphi _t\circ \varphi _1^{-1}\) (a transformation from the endpoint of the flow \(t=1\), to time t). The backpropagation for 2D diffeomorphisms is similar.
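
The operator \(K*\) is conveniently applied in the Fourier domain, where \(L\) acts by multiplication. The sketch below assumes a periodic finite-difference discretization of the Laplacian, one common choice rather than necessarily ours:

```python
import numpy as np

def apply_greens_kernel(g, a, dx):
    """Convolve a vector field with the Green's kernel K of L*L,
    L = (id - a^2 Laplacian)^2, via FFT. g: (3, nx, ny, nz); dx: voxel sizes."""
    shape = g.shape[1:]
    freqs = [np.fft.fftfreq(n, d) for n, d in zip(shape, dx)]
    K = np.meshgrid(*freqs, indexing="ij")
    # Symbol of the periodic finite-difference Laplacian (always <= 0).
    lap = sum((2.0 / d**2) * (np.cos(2.0 * np.pi * k * d) - 1.0)
              for k, d in zip(K, dx))
    Lhat = (1.0 - a**2 * lap) ** 2          # Fourier multiplier of L
    ghat = np.fft.fftn(g, axes=(1, 2, 3))
    # K is the inverse of L*L, i.e. division by Lhat^2 in the Fourier domain.
    return np.real(np.fft.ifftn(ghat / Lhat**2, axes=(1, 2, 3)))
```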

To simplify backpropagation of gradients from 3D to 2D, we interpret each slice as a 3D volume at the appropriate plane, using \(J^i(x,y,z) \simeq \varDelta (z - z_i)J^i(x,y)\), for \(\varDelta \) a triangle approximation to the Dirac \(\delta \) with width given by the slice spacing, modeling the process of physically sectioning the tissue at known thickness. This allows our cost to be written entirely as an integral over 3D space, allowing the backpropagation with standard approaches.
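
A minimal implementation of this slice profile:

```python
import numpy as np

def triangle_delta(z, z_i, width):
    """Triangle approximation to the Dirac delta centered at z_i, with support
    equal to the slice spacing; integrates to 1 over z."""
    return np.maximum(0.0, 1.0 - np.abs(z - z_i) / width) / width
```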

All intensity transformation parameters are computed exactly at each iteration of gradient descent by weighted least squares. This includes unknown polynomial coefficients for the intensity transformations on each slice, and unknown means for (B)ackground pixels (missing tissue) or (A)rtifact (\(\mu _B,\mu _A\)).
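
Given the posterior weights, the background and artifact means are simply posterior-weighted averages of the observed pixels; a short sketch:

```python
import numpy as np

def update_class_means(J, W_B, W_A):
    """Exact weighted least squares updates for the background and artifact
    RGB means. J: (nx, ny, 3); W_B, W_A: (nx, ny) posterior weights."""
    Jf = J.reshape(-1, 3)
    mu_B = (W_B.ravel()[:, None] * Jf).sum(0) / W_B.sum()
    mu_A = (W_A.ravel()[:, None] * Jf).sum(0) / W_A.sum()
    return mu_B, mu_A
```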

E Step: Given an estimate of each of the transformation parameters, the E step corresponds to estimating the posterior probabilities that each pixel corresponds to the atlas, to missing tissue, or to artifact. This is computed at each pixel via Bayes’ theorem using a Gaussian model, specifying variances \(\sigma ^2_M=\sigma ^2_B=\sigma ^2_A/100\) for the image (M)atching, (B)ackground, and (A)rtifact classes. Additionally, we include a spatial prior with a Gaussian shape of standard deviation 3 mm, making missing tissue and artifacts more likely to occur near the edges of the image, a common feature of this dataset (e.g. streaks near the bottom of images in Fig. 1).
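
For concreteness, a sketch of this E step under the stated variance relationships; the edge-distance prior shown here is one plausible realization of the 3 mm Gaussian spatial prior described above (dist_to_edge, in mm, is an assumed input):

```python
import numpy as np

def e_step(J, Jhat, mu_B, mu_A, sigma_M=0.05, dist_to_edge=None, prior_sd=3.0):
    """Posterior probabilities that each pixel matches the atlas (M), is missing
    tissue/background (B), or artifact (A), with sigma_M^2 = sigma_B^2 = sigma_A^2/100."""
    sigma_B, sigma_A = sigma_M, 10.0 * sigma_M

    def gauss(x, mu, sig):  # isotropic Gaussian likelihood over 3 RGB channels
        r2 = np.sum((x - mu) ** 2, axis=-1)
        return np.exp(-r2 / (2.0 * sig**2)) / (2.0 * np.pi * sig**2) ** 1.5

    fM = gauss(J, Jhat, sigma_M)
    fB = gauss(J, mu_B, sigma_B)
    fA = gauss(J, mu_A, sigma_A)
    if dist_to_edge is not None:
        # Spatial prior: B and A more probable near the image edges.
        edge = np.exp(-dist_to_edge**2 / (2.0 * prior_sd**2))
        fB, fA = fB * edge, fA * edge
    total = fM + fB + fA
    return fM / total, fB / total, fA / total   # W_M, W_B, W_A
```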

Fig. 3. Registration results for 3 typical slices i. Top: target image \(J^i\) with annotations overlaid. Middle: transformed atlas image \(\hat{J}^i\), predicting the RGB contrast of the observed Nissl images from the grayscale atlas. Bottom: posterior probabilities that each pixel corresponds to the deformed atlas (red), an artifact (green), or missing tissue (blue). (Color figure online)

Fig. 4. Registration results for 3 thick cut slices. Layout as in Fig. 3.

3 Results

We demonstrate our algorithm by mapping the atlas onto 484 tissue slices from one mouse brain, produced using a tape transfer technique [17, 29, 32]. Of these, 460 were 20 \(\upmu \)m thick and produced using a standard Nissl staining technique, and 24 were 100 \(\upmu \)m thick for snRNA-seq analysis. From these thick slices, 2–12 punches of roughly 1 mm diameter were removed before tissue staining. The images \(J^i\) were resampled to 45 \(\upmu \)m resolution, with a maximum size of \(864\times 1020\) pixels, and stored as RGB TIFFs with a bit depth of 24 bits per pixel, for a total of approximately 500 MB. While we show results here for one brain, on the order of 50 samples are becoming available as part of the BICCN project. To avoid local minima, the registration procedure is performed at 3 resolutions (downsampled by 4, then 2, then native resolution) and takes approximately 24 h total using 4 cores on an Intel(R) Xeon(R) CPU E5-1650 v4 at 3.60 GHz. All our registered data are made available through http://mousebrainarchitecture.org, and the accuracy of our mapping is verified by anatomists on each slice using a custom designed web interface.

The mapping accuracy for three typical slices is shown in Fig. 3. The top row shows our raw images with atlas labels superimposed. The second row shows our predicted image \(\hat{J}^i\) for each corresponding slice, with a 1 mm grid overlaid to show the distortion of the atlas. The bottom row shows our estimates of missing tissue (blue) or artifact (green). Note that a large streak artifact in the left column is detected (green), as is missing or torn tissue (blue or green).

Results for 3 nearby thick cut slices are shown in Fig. 4. Here we see that missing tissue is easily detected (blue) and does not interfere with registration accuracy. Other artifacts including smudges on microscope slides are also detected. In Fig. 5 we show registration results for the same slices, using a traditional approach without identifying abnormalities (\(W^i_M\!=\!1,W^i_A\!=\!W^i_B\!=\!0\)). One observes inaccurate registration and dramatic distortions of the atlas.

Fig. 5. Registration results on thick cut slices using a traditional algorithm (no EM). Layout as in Fig. 3 (top two rows).

4 Discussion

In this work we described a new method for mapping a 3D atlas image onto a series of 2D slices, based on a generative model of the image formation process. This technique, which accommodates missing tissue and artifacts through an EM algorithm, was essential for reconstructing the heavily processed tissue necessary for snRNA-seq. We demonstrated the accuracy of our method with several examples of typical and atypical slices, and illustrated its improvement over standard approaches. This work is enabling the BICCN’s goal of quantifying cell diversity throughout the mouse brain in standard 3D coordinates. While the results presented here demonstrate a proof of concept, future work will quantify accuracy on a larger sample in terms of distance between labeled landmarks and overlap of manual segmentations.

This work departs from the standard random orbit model of Computational Anatomy, in that our observed 2D slices do not lie in the orbit of our 3D template under the action of the diffeomorphism group. This fundamental asymmetry between atlas and target images is addressed by using a realistic sequence of deformations that model the sectioning and tape transfer process.

While the results shown here were restricted to Nissl stained mouse sections, this algorithm has potential for broader impact. A common criticism of using brain mapping to understand medical neuroimages is its inability to function in the presence of abnormalities such as strokes, multiple sclerosis lesions, or other pathologies. This algorithm can be applied in these situations, automatically classifying abnormal regions and down-weighting the importance of intensity matching in these areas.