1 Introduction

Transmission Electron Microscopy (TEM) can reveal information about very fine structures (\(\sim \)1 nm) in biological tissue sections. Pathologists analyze the morphology of such ultrastructures to diagnose certain clinical conditions. One such example is Primary Ciliary Dyskinesia (PCD). PCD is an autosomal recessive genetic disease in which specialized cell structures, cilia, do not function normally. Cilia are hair-like organelles protruding from cells, and dysfunctional motile cilia results in serious problems, e.g. long term respiratory infection and infertility in males and females. Therefore, pathologists analyze cilia substructures like dynein arms (DA), radial spokes, nexin links, and central pair, see Fig. 1. Though the outer and inner dynein arms (ODA and IDA) are of particular interest for PCD diagnosis [18], defects in other substructures could also be the result of the PCD condition and must not be ignored.

Fig. 1.
figure 1

Example image of perpendicularly cut cilium and related terminology.

In current practice, a large number (\(\ge \)50) of perfectly perpendicularly cut cilia need to be detected in the sample and visually analyzed at high magnification. The procedure is time consuming and therefore costly, it takes around two hours for a proficient pathologist to perform analysis of a single case. The poor signal-to-noise ratio (SNR) in the TEM images also makes image interpretation difficult. The automation of processes at different levels of the diagnostic procedure is hence highly desirable. The work-flow of a proposed automated approach is illustrated in Fig. 2.

In this multiscale approach, we search for cilia-like objects at low-resolution, on detecting a sufficient number of objects we acquire relatively high-resolution images of specific regions and perform analysis to create super-resolution cilium. The collage of high-resolution images along with the enhanced reconstructed cilium is presented to a pathologist for final diagnosis. We have successfully addressed the main challenge, cilia detection at low magnification [16]. We have also presented preliminary results for super-resolution reconstruction of cilium at mid magnification [10], using an automated approach.

Fig. 2.
figure 2

Automated TEM imaging and analysis workflow. (The steps in focus in this paper are highlighted.)

Some work related to enhanced cilium reconstruction for PCD diagnosis has previously been reported. It involves manual or user guided selection, segmentation, and alignment of good cilia instances from digitized high-resolution images [1, 4, 5]. As mentioned, these approaches are manual and don’t take local deformations into account, unlike the one proposed in this work. In addition, alternative automated approaches for PCD diagnosis involves enhancement of ODA by averaging a large number of outer microtubule doublets extracted from many cilia instances [12], and classification of individual DA in cilia, based on their lengths [13]. These approaches focus only on the analysis of DA, whereas our technique performs automated enhancement of all the substructures which could allow PCD diagnosis in a larger number of patients.

The method we proposed in [10] uses fuzzy object representation to create a super-resolution reconstruction from a number of mid-magnification cilia cross-sectional instances aligned using rigid registration. Registering many instances improves the SNR, thus the DA appearance and super-resolution reconstruction further boost the DA representation. It is observed that the quality of the super-resolution image would benefit from an improved registration, which could allow both global and local deformations.

Image-based registration techniques are usually preferred over feature-based techniques for TEM images, as feature extraction from noisy TEM images is quite a challenging task [3, 6]. In this paper, we propose a multi-step registration approach which utilizes customized weight masks at each step, followed by super-resolution reconstruction from a set of registered images in order to enhance the appearance of cilia substructures. The first step is dedicated to aligning the central pair using rigid registration, the second step focuses on coarse alignment of the ring of the outer doublets using affine registration, and the final step is dedicated to the fine alignments of substructures using non-rigid registration. Utilizing optimized weight mask for each registration step is the novelty of this work. Mask derivation is discussed in details in Sect. 3.3.

2 Methodology

2.1 Image Registration

We use standard multi-step registration strategy adapted to our application. In each step, we performed pair-wise registration where one image acts as a static image (\(I_s\)) and the second image as a moving image (\(I_m\)), and the goal is to find the displacement field which defines the mapping between the co-ordinates of the two images. Thus the registration of image-pair can be defined as energy minimization problem to find the displacement field:

$$\begin{aligned} D=\arg \min _{\phi } E_{reg}(\phi ;\,I_s,I_m), \end{aligned}$$
(1)

where \(\phi \) is current displacement field and D is the final displacement field. \(E_{reg}(\phi )\) is the energy function, which is defined as:

$$\begin{aligned} E_{reg}(\phi ,\,I_s,I_m)=-{wNCC}(I_s, T(I_m,\phi ))+\lambda _{reg}R(\phi ), \end{aligned}$$
(2)

where \(T(I_m,\phi )\) is the result of transformation of \(I_m\) using the displacement field \(\phi \), wNCC is weighted normalized cross-correlation used as similarity measure, and R is regularization term with regularization parameter \(\lambda _{reg}\). For non-rigid registration (\(\lambda _{reg}>0\)) and for the rigid and affine registrations (\(\lambda _{reg}=0\)).

The wNCC used in the registration and at later stage for choosing a set of registered images for super-resolution reconstruction, is defined as:

$$\begin{aligned} wNCC(I_s,I_v) = \frac{\sum W_n \cdot [I_s - \overline{I}_{sw}] \cdot [I_v - \overline{I}_{vw}]}{\big (\sum W_n \cdot [I_s - \overline{I}_{sw}]^2 \sum W_n \cdot [I_v - \overline{I}_{vw}]^2\big )^{0.5}}\,, \end{aligned}$$
(3)

where \(W_n\) is a normalized weight mask such that \(\sum W_n=1\). \(\overline{I}_{sw}\) and \(\overline{I}_{vw}\) are the weighted means of the static and variable image calculated using \(W_n\) as:

$$\begin{aligned} \overline{I}_{sw}=\sum W_n\cdot I_s\,, \quad \overline{I}_{vw}=\sum W_n\cdot I_v \quad and \quad W_n(x,y) =\frac{W(x,y)}{\sum W}\,, \end{aligned}$$
(4)

where W is the user defined weight mask, and where (\(A \cdot B\)) means point-wise multiplication, and (\(\sum A\)) means sum over all elements of A.

2.2 Multi-step Registration

Rigid Registration: The focus of the first step is on aligning the central pair. The rotational symmetry of the ring of the outer doublets makes this task challenging, as there is a high possibility of getting stuck in local minima. Therefore, we follow multi-position initialization, using 9 rotations of the moving image as starting positions in the range [0–320\(^\circ \)] with a step size of \(40^\circ \), and selecting the one with the best-achieved wNCC score. The weighting mask (\(W_{rig}\)) only covers the central pair with high weights at the central region, as later described in Sect. 3.3. The resulting transformation matrix is used as the initial transformation in the next registration step.

Affine Registration: The second step focuses on aligning the ring of the outer doublets using affine transformation without affecting the alignment of the central pair attained by rigid registration. This is achieved by using another experimentally derived weight mask (\(W_{aff}\)), as later described in Sect. 3.3. The mask covers all the rings but assigns relatively high weights to the central region in comparison to the periphery. The resulting transformation matrix is used as the initial transformation in the next registration step.

Non-rigid Registration: The final step focuses on the fine adjustments in local regions. This step uses a free form deformation (FFD) model based on B-splines [8, 9] which is often used in medical image registration [15]. The FFD model transforms an image based on the transformation of a grid placed over an image where the grid nodes act as control points. A set of B-splines is used to guide the transformation where each B-spline transformation is influenced by a set of control points. Each individual control point is tuned iteratively to lead the local transformation until convergence or a specified exit condition is reached. The initial displacement field is generated from the affine transformation, and the final displacement field is a combination of multiple transformed B-splines updated iteratively under the influence of their respective control points [13]. Let the image domain is \(\varOmega =\{(m,n)\mid 0 \le m< M, 0 \le n < N\}\) and \(\varPhi \) denotes a [\(P_{m} \times P_{n}\)] mesh of control points with constant uniform spacing of \(\delta \) in x and y-Cartesian direction. Let \(\phi _{i,j}\) is the value of the control point located at (ij), with \(-1 \le i < P_{m}\), \(-1 \le j < P_{n}\) and (\(P_{m}=M+2, P_{n}=N+2\)). Then the approximate transformation using cubic B-splines function that represents the local deformation can be defined as [7, 9]:

$$\begin{aligned} T_{local}(m,n)=\sum _{r=0}^3 \sum _{s=0}^3 B_{r}(u) B_{s}(v) \phi _{(i+r,j+s)}\,. \end{aligned}$$
(5)

Here, \(i=\lfloor m/\delta \rfloor -1\), \(j=\lfloor n/\delta \rfloor -1\), \(u=(m/\delta ) - \lfloor m/\delta \rfloor \) and \(v=(n/\delta ) -\lfloor n/\delta \rfloor \). The functions \(B_{r}\) and \(B_{s}\) are the cubic B-spline polynomials as defined in [7, 9], where \(0 \le z < 1\):

$$\begin{aligned} B_0(z)&=(1-z)^3/6\,,&B_1(z)&=(3z^3-6z^2+4)/6\,,\\ B_2(z)&=(-3z^3+3z^2+3z+1)/6\,,&B_3(z)&=z^3/6\,. \end{aligned}$$

The control points determine the degrees of freedom (DoF) and the amount of non-rigid deformation, depending on the resolution of the control points grid. A low-resolution grid performs coarse non-rigid alignment and on increasing the grid resolution the alignment gets finer. However, many DoF come with a high computational cost. To achieve a good balance, we employed a pyramidal multi-resolution approach [9], where both the image and the control grid resolutions are increased from coarse-to-fine at each level. Let the local transformation at any pyramid level be denoted as \(T_{local}^{pl}\), then the final non-rigid registration displacement field is defined as, \(D=\sum _{pl=1}^{L} T_{local}^{pl}\,\).

Regularization: In order to constrain the deformation to avoid unrealistic transformations, a 2D bending energy of a thin-plate of metal [17] is used as penalty term to regularize the deformation [14] is defined as:

$$\begin{aligned} R(\phi ) = \frac{1}{\left|\varOmega \right|} \iint \limits _{\varOmega } \left[ \left( \frac{ \partial ^2 \phi }{\partial x^2 }\right) ^2 + 2\left( \frac{ \partial ^2 \phi }{\partial x \partial y }\right) ^2 + \left( \frac{ \partial ^2 \phi }{\partial y^2 }\right) ^2 \right] \,dxdy\,, \end{aligned}$$
(6)

The amount of penalty is controlled by regularization parameter \(\lambda _{reg}\) as stated in Eq. (2). The larger the regularization parameter \(\lambda _{reg} > 0\), the smoother the deformation field will be.

2.3 Super-Resolution Reconstruction

The set of registered cilia images is used to reconstruct a super-resolution (SR) image. The approach is inspired by the work presented in [10], but with a slightly adjusted regularization, based on experiences from [2]. To reduce disturbance from possible misalignments, we exclude the 25% lowest scoring registered images based on their wNCC scores. We formulate the SR reconstruction as a regularized energy minimization problem, where the reconstructed image h is estimated as:

$$\begin{aligned} h=\arg \min _u E_{sr}(u)\,. \end{aligned}$$
(7)

We utilize an energy function that includes the robust \(\ell _1\) norm to ensure noise insensitive adherence to the observed images, in combination with a Huberized TV-regularization which provides noise reduction while preserving edges. The energy function is of the form

$$\begin{aligned} E_{sr}(u) = \frac{1}{c} \sum _{i=1}^c \left\| S(u)-T(I_i; D_i) \right\| _1 + \lambda _{sr} \varPhi _H(\left|\nabla (u) \right|)\,, \end{aligned}$$
(8)

where \(S(\cdot )\) is a factor 2 subsampling operation, \(T(I_i; D_i)\) is the i-th registered observed image and \(D_i\) is the displacement field estimated for I. \(\varPhi _H(t)\) is the Huber potential function

$$\begin{aligned} \varPhi _H(t) = \left\{ \begin{array}{ll} \frac{t^2}{2\omega }, &{} t\le \omega \\ t-\frac{\omega }{2}, &{} t>\omega \,, \end{array} \right. \end{aligned}$$

and \(\nabla \) is the discrete image gradient. The two regularization parameters \(\lambda _{sr}\) and \(\omega \) are empirically tuned to \(\lambda _{sr}=0.05\), \(\omega =0.05\). Equation (7) is minimized using spectral projected gradient optimization, see [10] for details.

3 Experiments

The focus of the experiments is to derive the different optimized weight masks suitable for each registration step and the evaluation of the accuracy of the proposed registration technique. This approach is referred as the multiple-mask strategy from here on.

3.1 Image Data and Ground Truth

All the experiments are performed on a dataset of 20 representative cilia instance image patches chosen from a total of 30, which were detected using the automated detection technique presented in [11]. Each image patch is of size \(128\times 128\) pixels, extracted from a \(2048 \times 2048\) pixels image, acquired at mid-magnification using the MiniTEMFootnote 1 system. To evaluate registration algorithm performance, 20 landmarks were placed for each cilium instance by author IMS at the approximate centre of the microtubuli of the central pair (2) and the outer doublets (18). An example of a cilium image patch and corresponding marked landmarks is shown in Fig. 3.

Fig. 3.
figure 3

(a) Cilium instance, (b) landmarks manually placed on (a).

3.2 Average Pixel Alignment Error (PAE)

The registration accuracy is measured as the pixel alignment error defined as the average Euclidean distance between the landmarks from the reference image to its closest landmark in the registered image. Let \(L_{ref}\) and \(L_{reg}\) be the set of 20 landmarks in the reference and the registered images, then the PAE for that image-pair is:

$$\begin{aligned} PAE=\frac{1}{N_{L}} \sum _{p \in L_{ref}} \min _{q \in L_{reg}} d(p,q), \end{aligned}$$
(9)

where \(d(\cdot )\) is the Euclidean distance and \(N_{L}\) is the number of landmarks considered while computing the PAE. For the central pair \(N_{L}=2\), for outer rings \(N_{L}=18\), and for all rings \(N_{L}=20\).

3.3 Weight Masks

Each weight mask is tuned for two parameters, the size, and the weight distribution. We evaluate weight masks with sizes ranging from those covering only the central pair up to those completely covering the outer ring. For weight distribution, uniform distribution and variations of the Hann window are tested. H1 denotes the Hann window defined by a radial profile \(h(r)=0.5(1+cos(\frac{\pi r}{R_{sz}}))\), where r is the radial distance from the center of the mask, and \(R_{sz}\) is the total radius of the mask, and H2, H3, and H4 denote the windows defined by \((h(r))^2\), \((h(r))^3\), and \((h(r))^4\), respectively.

Fig. 4.
figure 4

Illustration of weight mask size extent and distributions (mask size = 1 r)

Figure 4 shows the coverage of circular mask of radius 1 r and the corresponding weights over cilium regions for different distributions. For the non-rigid registration, only the uniform distribution is considered while performing size optimization, as all regions in the cilium are equally important. The approximate radius (r) of cilium, which is the distance from the cilium centre to the plasma membrane is considered as 55 pixels. This is chosen based on the observations from 30 initially detected cilia instances.

Table 1 shows the \(PAE_{cl}\) for the central landmarks, measuring the performance of weight masks tested for the rigid registration (\(W_{rig}\)) step, in the size range [0.4 r–0.7 r]. The performance for (H2, 0.6r) clearly indicates that to achieve a good alignment of the central pair, high weight must be given to the central region. Table 2 shows the \(PAE_{ol}\) for the outer landmarks, measuring the performance of weight masks tested for the affine registration (\(W_{aff}\)) step, in the size range [0.8 r–1 r]. The performance for (H1, 1 r) indicates that to achieve good alignment of the outer rings without disturbing the central ring alignment, a good balance of weight is important for the central and the outer regions with high weight at the central region to compensate for the 9 outer doublets. Table 3 shows the \(PAE_{cl}\), \(PAE_{ol}\) and \(PAE_{al}\) for the central, outer and all landmarks respectively, measuring the performance of weight masks tested for the non-rigid registration (\(W_{nrr}\)) step, in the size range [0.9 r–1.1 r]. The performance for (Uniform, 1 r) indicates that to achieve good local-alignment, especially for the outer rings, we should not consider the plasma membrane as it could mislead the outer region registration. Table 4 summarizes our recommendation for suitable masks at each registration step, and Fig. 5 illustrates the extent and weight distribution of the recommended masks.

Table 1. \(PAE_{cl}\) for \(W_{rig}\)
Table 2. \(PAE_{ol}\) for \(W_{aff}\)
Table 3. PAE for \(W_{nrr}\)
Table 4. Weight masks details
Fig. 5.
figure 5

Recommended weight masks (a) \(W_{rig}\), (b) \(W_{aff}\), (c) \(W_{nrr}\).

3.4 Algorithm

The algorithm takes an image-pair and multiple weight masks (\(W_{rig}\), \(W_{aff}\) and \(W_{nrr}\)) as input. In the rigid registration step, input images are pre-processed with a Gaussian filter and downsampled to half of the original size. In the affine registration step also images are pre-processed using Gaussian smoothing filter, but registration is performed on the original image sizes. In the non-rigid registration step a three level resolution pyramid is used. At the highest level, images are smoothened with a Gaussian filter and resized to half of the original image size, and coarse non-rigid registration is performed using a low resolution grid. The generated displacement field is further refined at the middle level, where images are smoothened with Gaussian filter and processed at their original sizes. The grid resolution used is twice that of the previous level. At the lowest pyramid level registration is performed on the original images and a grid resolution is set close to the original image size. The resulting displacement field defines the mapping between the image-pair and when applied on the moving image results in the registered image \(I_r\).

Table 5. Registration algorithm parameter details

The parameter details are shown in Table 5. G\(_\sigma \) and G\(_{sz}\) represents the sigma value and the Gaussian filter size while, I\(_{sz}\) and Gr\(_{sz}\) are the image size and the grid size. P\(_{sp}\) is the control point spacing and the regularization weight is (\(\lambda _{reg}=8e-4\)). The values for \(G_{\sigma }\) and \(\lambda _{reg}\) were chosen based on experiments on synthetic data for \(G_{\sigma }\) and on real data for \(\lambda _{reg}\). The algorithm takes \(\approx \)60 s per image-pair in MATLAB on a 2.3 GHz Intel Core i7 CPU.

4 Results

In this section results for cilia registration using the proposed multiple-masks strategy and super-resolution reconstruction are presented. We also presented the cilia registration performance of two obvious approaches, first, registering without using any mask, and second, registering using the uniform weight mask with radius 1 r, which only covers the region within the plasma membrane of the cilium. We referred to the former as no-mask and the later as a constant-masks strategy. In Table 6, the mean of average pixel alignment error (\(\overline{PAE}\)) calculated using all the image-pairs is reported for before registration, and after registration using different weight mask(s) strategies. The registration performance is computed separately for the central, outer and all landmarks. Results clearly indicate that the central pair is best aligned using the multiple-masks, whereas the alignment of the ring of the outer doublets is best for the constant-masks, closely followed by the multiple-masks. The overall performance of the multiple-masks is slightly better than the constant-masks. An example of the results for each registration step using multiple-masks is shown in Fig. 6. Figure 7 shows the SR images reconstructed using different weight masks strategies. Here, 15 cilia instances with the highest wNCC scores in the registration were used to create the respective SR image.

Table 6. Mean PAE summary
Fig. 6.
figure 6

Step-wise performance of registration using the multiple-masks approach. (a) Relative position of landmarks in the static (\(+\)) and the moving () image before registration, (b) after rigid registration, (c) after affine registration and (d) after non-rigid registration. Landmarks are used for evaluation only.

Fig. 7.
figure 7

SR cilia with (a) no-mask, (b) constant-masks, (c) multiple-masks.

5 Discussion and Conclusion

In this paper, we present a technique to enhance the cilia substructures by registering multiple instances of cilia cross-sections followed by SR reconstruction. Cilium instance registration is achieved using a multi-step registration strategy where we use different weight masks, with the aim to align different regions at each registration step. Using the proposed multiple-masks results in a \(\overline{PAE}\) of \(2.35\pm 1.82\) for all the landmarks, which is better than using no-masks or constant-masks, see Table 6 (All). The constant-masks, however, performed slightly better for aligning the outer rings but failed for the central pair, see Table 6 (Outer and Central). The subjective results of reconstructed SR cilium in Fig. 7c also support our quantitative results where the central pair, radial spokes and dynein arms have relatively better visibility than the other two results. As the constant-masks performed better in aligning the outer doublets, the corresponding SR cilium in Fig. 7b has good visibility of DA, but poor for radial spokes and the central pair. The same is validated by our expert pathologist, author AD. With these observations we propose to use either the multiple-masks or the constant-masks depending of the requirements of the problem at hand.

Fig. 8.
figure 8

Exceptions in alignment, (a) stuck in local minima, (b) case of poor result from multi-position initialization.

Figure 8a shows a case when the algorithm achieves good alignment of the central pair after the rigid registration step but fails to keep the alignment after the affine registration step. Figure 8b shows the best multi-position initialization which was achieved for a position that was unfavorable for the ring of the outer doublets, making the total registration poor. To avoid disturbance from failure cases, the SR reconstruction is only performed from well-registered instances. Future work involves evaluation of the method on a larger dataset.