1 Introduction

Multi-atlas label fusion (MALF) is a powerful technique for anatomy segmentation. This method relies on image registration to propagate anatomical labels from pre-labeled training images, i.e. atlases, to a target image and applies label fusion to reduce atlas propagation errors.

Template-based atlas propagation [8] has been applied for reducing registration cost in MALF. Using this approach, instead of directly registering each atlas to a target image, the pairwise registration is achieved through registering each image to one common template. Since registrations between atlases and the template can be calculated off-line, only one registration between the template and the target image needs to be calculated online.

One commonly applied criterion for choosing the propagation template is to reduce the overall atlas propagation error from atlases to the template [4, 8]. We show that a more effective criterion should aim to reduce the propagation error from the template to the target image. Hence, instead of employing a common propagation template, a custom selected template should be used for each individual target image. We propose to employ a sizable template library. Given a target image, the template producing the least registration error to the target image is selected for optimal atlas propagation.

In an application of cardiac CT segmentation, we demonstrate that our method significantly outperforms standard common template based atlas propagation. At a small fraction of the computational cost, our method produces results comparable to standard MALF that uses pairwise deformable registrations.

1.1 Related Work

Faster atlas propagation can be achieved by: (1) using cheaper but less accurate registrations to replace deformable registration [5]; (2) reducing the number of online registrations. Employing less accurate registrations often substantially sacrifices accuracy. For instance, a recent work along this line [2] affinely warps images to common templates and applies learning-based refinement, which still underperforms deformable registration based multi-atlas segmentation.

Reducing online registrations can be achieved by atlas selection [1, 10] and/or template-based propagation [8]. Atlas selection aims to select a subset of atlases that are likely to produce accurate label propagation for a target image. However, it has limited effects on reducing registration cost. The performance of multi-atlas segmentation usually increases as poorly registered atlases are excluded from label fusion. However, the performance may start decreasing after removing well registered atlases. Typically, a good number of atlases are still required for label fusion to prevent performance drop.

Template-based atlas propagation is a form of indirect propagation, which composes registrations along a registration path through one or more intermediate images. Indirect propagation has been applied for improving atlas propagation accuracy. For example, each atlas is propagated through multiple registration paths to improve the chance that atlas information is accurately propagated at least once [9, 12, 14]. With manifold learning, instead of the brute-force approach, efficiency can be improved by decomposing a difficult-to-estimate large deformation between two images into a series of easier-to-estimate smaller deformations represented by intermediate propagation images/templates [6, 13]. When applied for reducing registration cost, the indirect registration scheme has only been applied through common template(s) based atlas propagation. Our contribution is a new strategy for optimal template-based propagation.

2 Method

2.1 Modeling Atlas Propagation Error

Let \(\phi _{I\rightarrow K}\) be a transformation that aligns image I to image K. Let \(f\left( \phi _{I\rightarrow K}, x\right) \) be the registration error at location x in K, i.e. the absolute spatial displacement between the true and estimated correspondences. Let \(\phi _{I\rightarrow T\rightarrow K}=\phi _{I\rightarrow T} \circ \phi _{T\rightarrow K}\) be the composed transformation for propagating I through a template T. We have:

$$\begin{aligned} f\left( \phi _{I\rightarrow T\rightarrow K}, x\right) \le f\left( \phi _{I\rightarrow T}, x_T\right) + f\left( \phi _{T\rightarrow K}, x\right) \end{aligned}$$
(1)

where \(x_T\) is the correspondence of x in T, as defined by \(\phi _{T\rightarrow K}\). Let \(F(I,K)=\sum _{x\in K}\left[ f\left( \phi _{I\rightarrow K}, x\right) \right] \) and \(F(I,T,K)=\sum _{x\in K}\left[ f\left( \phi _{I\rightarrow T\rightarrow K}, x\right) \right] \) be the overall registration error in \(\phi _{I\rightarrow K}\) and \(\phi _{I\rightarrow T\rightarrow K}\), respectively. We have:

$$\begin{aligned} F(I,T,K) \le F(I,T) +F(T,K) \end{aligned}$$
(2)

For indirect propagation, the overall registration error is upper bounded by the total registration errors in the two registrations on the registration path.

Fig. 1. Atlas propagation through a single selected template.

2.2 Biased Template Selection

Let \(A=\{A_1,...,A_n\}\) be an atlas set with n atlases. The total propagation error from an atlas set to a target image through a single template is bounded by:

$$\begin{aligned} \sum _{i=1}^n F(A_i,T,K) \le \sum _{i=1}^n F(A_i,T) + n F(T,K) \end{aligned}$$
(3)
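As a brief check, (3) can be obtained by applying (2) to each atlas \(A_i\) and summing over i; the only step needed is that \(F(T,K)\) does not depend on i, so it contributes n times:

```latex
\begin{aligned}
\sum_{i=1}^n F(A_i,T,K)
  &\le \sum_{i=1}^n \left[ F(A_i,T) + F(T,K) \right] \\
  &= \sum_{i=1}^n F(A_i,T) + n F(T,K)
\end{aligned}
```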

Our goal is to find a template T such that the total atlas propagation error is minimized, which can be approached by choosing the template that minimizes the upper bound. Based on (3), two competing schemes may minimize the upper bound: (1) unbiased template creation/selection that minimizes the average registration error from all atlases to a common template, i.e. minimizing \(\sum _{i=1}^n F(A_i,T)\); (2) biased template selection that selects the template minimizing registration error to the target image, i.e. minimizing \(nF(T,K)\).

Minimizing registration errors from all atlases to a common template is the goal for unbiased template building [4, 7] and groupwise registration. However, its effect on reducing overall registration errors is limited by how tightly clustered the atlases are. The key advantage of MALF is to use diverse atlases to capture population variation for robust label propagation. Hence, it is common to have highly dissimilar images included in one atlas set, making it difficult to reduce the overall registration error from all atlases to a common template.

In contrast, the template-target registration error can be more easily minimized by choosing a template similar to the target image (see Fig. 1). Furthermore, since the template-target registration error has the highest weight in (3), the template minimizing the upper bound is also biased to reduce registration error to the target image. In fact, when \(T=K\), the template-based atlas propagation becomes direct registration based propagation. Although it is intractable to consider every image as a potential template, we hypothesize that, for any target image, a similar template can very likely be found in a modestly sized and representative template library, keeping the total atlas propagation error close to the total error produced by direct registration based atlas propagation.
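The trade-off between the two terms of (3) can be illustrated with a minimal Python sketch. It assumes the error totals \(F(A_i,T_j)\) and \(F(T_j,K)\) are available as arrays, which is only true in a simulation or when ground-truth correspondences exist; the function names and the toy numbers below are hypothetical and are not taken from the experiments.

```python
import numpy as np

def upper_bound(atlas_to_template, template_to_target):
    """Right-hand side of (3) for every candidate template.

    atlas_to_template:  (n_atlases, n_templates) array of F(A_i, T_j)
    template_to_target: (n_templates,) array of F(T_j, K)
    """
    a = np.asarray(atlas_to_template, dtype=float)
    t = np.asarray(template_to_target, dtype=float)
    n = a.shape[0]  # the template-target term is weighted by n
    return a.sum(axis=0) + n * t

def select_unbiased(atlas_to_template):
    """Scheme (1): minimize the summed atlas-to-template error."""
    return int(np.argmin(np.asarray(atlas_to_template, dtype=float).sum(axis=0)))

def select_biased(template_to_target):
    """Scheme (2): minimize the template-to-target error."""
    return int(np.argmin(np.asarray(template_to_target, dtype=float)))

# Hypothetical toy values: 3 atlases, 2 candidate templates.
F_AT = np.array([[4.0, 5.0],
                 [4.0, 5.0],
                 [4.0, 5.0]])   # template 0 is closer to the atlases
F_TK = np.array([6.0, 2.0])     # template 1 is closer to the target

bounds = upper_bound(F_AT, F_TK)
```

In this toy case the unbiased scheme picks template 0, while both the biased scheme and the minimizer of the full bound pick template 1, because the template-target term is multiplied by n.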

2.3 Multi-template Atlas Propagation

Atlas propagation through a single template is expected to approach the performance of direct registration based atlas propagation as the template library grows. With a finite template library, single template atlas propagation is still expected to underperform direct registration based atlas propagation. Since registrations between templates and a target image are calculated independently, the additional propagation errors caused by registration composition through different templates are independent of each other and can be effectively reduced by label fusion. Hence, employing a few templates that are similar to the target image for atlas propagation may completely remove the performance gap between template-based propagation and direct registration based propagation.
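To illustrate how fusion can suppress errors that differ between propagation paths, the sketch below fuses several warped label maps by per-voxel majority voting. This is a deliberately simpler stand-in, not the joint label fusion used in the experiments; the function name `majority_vote` and the toy label maps are hypothetical.

```python
import numpy as np

def majority_vote(label_maps, n_labels):
    """Fuse integer label maps of identical shape by per-voxel majority vote.

    label_maps: sequence of arrays holding labels in {0, ..., n_labels - 1}
    Ties are resolved in favor of the smallest label (argmax behavior).
    """
    stack = np.stack([np.asarray(m) for m in label_maps])
    # votes[l] counts how many maps assign label l at each voxel
    votes = np.stack([(stack == l).sum(axis=0) for l in range(n_labels)])
    return votes.argmax(axis=0)

# Toy example: the same atlas labels propagated through three templates,
# each path with an error at a different voxel.
via_t1 = np.array([0, 1, 1, 2])
via_t2 = np.array([0, 1, 2, 2])
via_t3 = np.array([1, 1, 1, 2])
fused = majority_vote([via_t1, via_t2, via_t3], n_labels=3)
```

Because the three paths disagree at different voxels, each isolated error is outvoted by the other two maps.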

2.4 Downsampling-Based Fast Template Selection

Template selection aims to select, from a template library, the template that has the smallest registration error to a target image. Compared with atlas selection [1, 10], template selection faces the following challenges: (1) to ensure small registration error from at least one template to any target image, a template set may contain more images than a typical atlas set; and (2) template selection needs to have low computational cost to achieve the goal of reducing overall computational cost.

To this end, each template is registered to the target image in a downsampled space. After registration, image similarity measures such as normalized mutual information (NMI) and sum of squared differences (SSD) are employed for template ranking. Note that both global and region of interest based image similarity can be employed. In our experiments, both templates and target images are downsampled to a coarse resolution such that deformable registration can be finished within a few seconds, keeping the total computational cost for template selection negligible compared with a regular registration. Registrations to the target image in the original space are only computed for the selected templates.
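The ranking step can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the experiments: downsampling is done by block averaging with an integer factor, NMI is computed from a joint histogram as (H(A)+H(B))/H(A,B), and the coarse registration is left as an optional user-supplied callable `register` (hypothetical; when omitted, images are compared without alignment). All function and parameter names are illustrative.

```python
import numpy as np

def downsample(volume, factor):
    """Block-average a 3D volume by an integer factor (crops any remainder)."""
    volume = np.asarray(volume, dtype=float)
    nz, ny, nx = (s // factor for s in volume.shape)
    cropped = volume[:nz * factor, :ny * factor, :nx * factor]
    return cropped.reshape(nz, factor, ny, factor, nx, factor).mean(axis=(1, 3, 5))

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    return float(np.sum((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2))

def _entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def nmi(a, b, bins=32):
    """Normalized mutual information; higher means more similar.

    Undefined (division by zero) if both images are constant.
    """
    hist, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    pxy = hist / hist.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    return float((_entropy(px) + _entropy(py)) / _entropy(pxy))

def rank_templates(target, templates, factor=5, metric="nmi", register=None):
    """Return template indices ordered from most to least similar to the target.

    register: optional callable (moving, fixed) -> warped moving image,
              standing in for the coarse deformable registration.
    """
    small_target = downsample(target, factor)
    costs = []
    for template in templates:
        small_template = downsample(template, factor)
        if register is not None:
            small_template = register(small_template, small_target)
        if metric == "nmi":
            costs.append(-nmi(small_template, small_target))  # negate: lower is better
        elif metric == "ssd":
            costs.append(ssd(small_template, small_target))
        else:
            raise ValueError("metric must be 'nmi' or 'ssd'")
    return [int(i) for i in np.argsort(costs)]
```

With images at 2 mm resolution, `factor=5` corresponds to the 10 mm space; taking the first k entries of the returned list gives the templates for multi-template propagation.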

3 Experiments

We conducted anatomy segmentation experiments using cardiac CT scans. Sixteen anatomical structures were manually traced by a clinician for 42 cases, namely, sternum, aorta (ascending/descending/arch/root), pulmonary artery (left/right/trunk), vertebrae, left/right atrium, left/right ventricle, left ventricular myocardium, superior/inferior vena cava. All images were resampled to have a 2 mm\(^3\) isotropic resolution. See Fig. 2(a) for one image with manual annotations.

Fig. 2. (a) Axial (left) and coronal (right) views of one CT image with manual annotations. (b) One template image at 2, 10, and 15 mm\(^3\) resolutions, respectively.

3.1 Experiment Setup

We conducted leave-one-out cross validation using the 42 labeled scans.

Image Registration. Image registration was computed using ANTs [3] by sequentially optimizing affine and deformable (SyN) transforms, using Mattes mutual information. Registering an image pair at 2 mm\(^3\) resolution took \(\sim \)50 min on a 2 GHz CPU.

Performance of Direct/Indirect Registration. To compare atlas propagation accuracy produced by direct registration and indirect registration composition, we calculated pairwise deformable registrations for each pair of the 42 images with manual segmentation. The indirect registrations were produced for each image pair by taking each of the remaining images as the propagation template. The performance of direct/indirect registrations is measured by how well anatomical structures are aligned, in terms of the Dice similarity coefficient (DSC).
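For reference, the DSC between two label maps for one structure is 2|A∩B| / (|A| + |B|), where A and B are the voxel sets carrying that label. A minimal Python sketch, assuming integer label volumes of identical shape (the function name `dice` is illustrative):

```python
import numpy as np

def dice(seg_a, seg_b, label):
    """Dice similarity coefficient of one label between two label maps.

    Returns NaN when the label is absent from both maps.
    """
    a = np.asarray(seg_a) == label
    b = np.asarray(seg_b) == label
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")
    return float(2.0 * np.logical_and(a, b).sum() / denom)
```

Averaging this value over all structures gives a single score per registration or segmentation.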

Label Fusion. We applied joint label fusion [11] with default parameters. Note that for our method, all atlases are propagated through the selected template(s) and are used for label fusion.

Template Library. The template library was created by randomly selecting cases without manual segmentation. To investigate how the size of template library affects the performance of atlas propagation, we created four template libraries with varying sizes of 10, 20, 50, and 100, respectively.

Downsampling Space for Template Selection. To study how downsampling affects template selection, we tested two downsampling resolutions: 10 mm\(^3\) and 15 mm\(^3\). Figure 2(b) shows one example template. Images at 10 mm\(^3\) and 15 mm\(^3\) resolutions only contain global structures, which are sufficient for a global alignment. Registration at 10 mm\(^3\) and 15 mm\(^3\) can be calculated within 5 s and 1 s, respectively.

Template Selection Metric. Following [8], we applied NMI and SSD for atlas/template selection. To test multi-template atlas propagation, we varied the number of selected propagation templates from 1 to 5.

Baseline Methods. MALF with direct registration atlas propagation and atlas selection was applied to set the baseline performance. We also compared with unbiased template building based atlas propagation. We iteratively applied the unbiased template building method [7] and k-means clustering to create common template(s) for the 42 testing images. For example, when k templates are created, the images are grouped into k clusters based on image similarity, and one template is created from each cluster. We varied the number of common templates from 1 to 4. For common template(s) based propagation, we applied two label fusion schemes: (1) each training image was propagated to a target image once through its nearest neighbor template; (2) each training image was propagated through all templates and all warped atlases were applied for label fusion.

3.2 Results

Direct/Indirect Registration. Indirect registration produces more registration errors than direct registration. The average DSC scores over all anatomical structures produced by direct/indirect registrations are 0.525 and 0.456, respectively.

Baseline MALF. Figure 3(a) summarizes the performance when various numbers of atlases were selected for label fusion. NMI consistently outperformed SSD. Meanwhile, using fewer atlases did not produce more accurate label fusion results than using all atlases when locally weighted voting fusion was applied.

Fig. 3. (a) Segmentation performance using direct atlas propagation with atlas selection; (b) Performance using atlas propagation via common templates and templates selected from a library with 100 images. The performance of Bench-MALF, which applied 41 atlases with direct atlas-target registrations, is shown for direct comparison.

Common Template Atlas Propagation. Figure 4 shows an example when three common templates are built. The created templates capture the fact that the images may cover different body regions.

Fig. 4. Coronal views of three templates created from unbiased template building.

Figure 3(b) shows the label fusion performance when various numbers of common templates were created for atlas propagation. When each atlas was propagated through multiple templates, the result is slightly better than that produced by propagating each atlas only through its nearest neighbor template. As more templates were created, the registration error from each atlas to its nearest neighbor template is reduced, which results in more accurate label fusion results. Overall, common template(s) based atlas propagation underperformed standard MALF using direct atlas propagation, even when four common templates were used. The difference is significant, with \(p<0.01\) on the paired Student's t-test.

Biased Template Selection Atlas Propagation. Table 1 summarizes the performance for biased template selection based atlas propagation. Again, NMI consistently outperformed SSD for template selection. Overall, using larger size template libraries produced more accurate results. The performance gain due to enlarged template library is more prominent when a single template was applied for atlas propagation. However, the performance gain diminishes as the performance approaches the accuracy level produced by standard MALF with direct registration. With NMI template selection, using a template library of size 50 or greater consistently and significantly (\(p<0.01\)) outperformed common template atlas propagation (also see Fig. 3).

Table 1. Label fusion performance produced by template selection based atlas propagation. \(^{*}\) indicates that the difference from the standard MALF with direct atlas propagation is statistically significant, with \(p<0.01\) on the paired Student's t-test.

With a single propagation template, the best accuracy is 0.799 mean DSC, produced by NMI selection in 10 mm\(^3\) space with a template library of 100 images. When two templates were applied for atlas propagation, the performance gap to direct registration based MALF is completely removed with a template library of size 50. Template selection using a library of size 50 takes less than 5 min and 1 min in 10 mm\(^3\) and 15 mm\(^3\) space, respectively. Using 2 deformable registrations plus template selection for atlas propagation, our method reached the performance of standard MALF, which requires 41 deformable registrations for 41 atlases. Hence, in our experiments the atlas propagation cost of our method is about 5% of that of standard MALF.
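The 5% figure can be reproduced with a short back-of-the-envelope calculation. The sketch below uses only numbers reported above (about 50 min per full-resolution registration, 41 atlases, 2 selected templates, and less than 5 min for template selection in 10 mm\(^3\) space, taken here as its upper limit); it counts online cost only and assumes the off-line atlas-to-template registrations are excluded. The helper name `online_cost_minutes` is hypothetical.

```python
MINUTES_PER_REGISTRATION = 50   # full-resolution deformable registration
N_ATLASES = 41                  # standard MALF: one registration per atlas
N_TEMPLATES = 2                 # selected templates registered online
SELECTION_MINUTES = 5           # upper limit reported for a library of 50 at 10 mm

def online_cost_minutes(n_registrations, selection_minutes=0):
    """Online cost: full-resolution registrations plus optional template selection."""
    return n_registrations * MINUTES_PER_REGISTRATION + selection_minutes

standard_cost = online_cost_minutes(N_ATLASES)
template_cost = online_cost_minutes(N_TEMPLATES, SELECTION_MINUTES)

registration_ratio = N_TEMPLATES / N_ATLASES   # registrations only
total_ratio = template_cost / standard_cost    # including template selection
```

Counting registrations alone gives 2/41, roughly 4.9%; including the selection step gives 105/2050, roughly 5.1%. Both are consistent with the stated figure of about 5%.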

4 Conclusions

We provide a justification for employing a sizable template library and biased template selection to improve the performance of template-based atlas propagation. In a cardiac CT anatomy segmentation application, our method consistently outperformed common unbiased template based atlas propagation. At about 5% of the registration cost, our method produced performance comparable to standard direct registration based MALF.