1 Introduction

Segmenting tubular structures is an important task in medical image analysis; for example, for assessing vascular diseases or tracking the progress of neurological disorders that manifest in spinal cord atrophy. Especially when used in large-scale clinical trials, largely automated segmentation is desirable to reduce the workload on clinical staff. Such automated segmentation approaches, in turn, should be robust with respect to specific choices of parameterization.

In this paper, we propose a segmentation method that fulfills both criteria: it is completely automated, and it creates segmentations of similar quality over a wide range of parameter choices. Our method adapts Yuan et al.’s continuous max flow approach [10] and combines it with an anisotropic total variation (ATV) regularization term. ATV keeps changes of the segmentation’s boundary small along the course of the tubular structure. We use Frangi et al.’s well-established vesselness feature [3] as our measure of tubularity, which we exploit twofold: both for finding the location and the orientation of the structures of interest.

The directional information of vesselness, which is usually neglected, has previously been used: Manniesing et al. [6] construct an anisotropic tensor from it for accentuating vascular structures in angiography images for image enhancement. Gooya et al. [4] use this tensor in an active contour framework for blood vessel segmentation. ATV-regularized segmentation has been generally described by Olsson et al. [7] and has been used, for example, by Reinbacher et al. [8] who segment thin structures of known volume based on first-order derivatives and a volume constraint. A review of vessel segmentation is given by Lesage et al. [5]. De Leener et al. [2] review the more specific topic of spinal cord segmentation.

Our contributions lie in incorporating Hessian-based vesselness into ATV-regularized segmentation and in integrating ATV into the continuous max flow framework [10]. To the best of our knowledge, neither has been attempted so far.

2 Methods

In Sect. 2.1, we motivate our choice of ATV regularization. In Sect. 2.2, we state the ATV-regularized segmentation problem in the continuous max flow framework and propose an algorithm for solving it. In Sect. 2.3, we describe how we incorporate the vesselness feature. In Sect. 2.4, we present our choice of flow capacities. For a good general introduction to continuous max flow, see [10].

Fig. 1. Left to right: T1 image I of the cervical spinal cord; closeups of the vessel directions \(v_{1}\) before GVF and \(\tilde{v}_{1}\) after GVF (vectors scaled by the segmentation \(u^{*}\) for visualization); distance map D of the vessel ridge \(\mathcal {R}\) (normalized for visualization); source capacities \(C_{s}\); sink capacities \(C_{t}\); nonterminal capacities C; segmentation \(u^{*}\).

2.1 Isotropic and Anisotropic Total Variation

Segmentation on the d-dimensional image domain \(\varOmega \subset \mathbb {R}^{d}\) can be formulated as the problem of finding a binary labeling \(u : \varOmega \rightarrow \left\{ 0, 1 \right\} \) for the given image \(I : \varOmega \rightarrow \mathcal {I}\) (e.g. with \(\mathcal {I} {\,=\,} \left[ 0, 1 \right] \) for a normalized single-channel image). In practice, the problem is often relaxed such that \(u : \varOmega \rightarrow \left[ 0, 1 \right] \), and the final labeling is determined by applying a threshold to the result of the relaxed problem [7, 10].

A common regularization term in segmentation is the total variation \(\hbox {TV}\), which minimizes the surface area of the segmented region, penalizing jumps between segmentation foreground (\(u {\,=\,} 1\)) and background (\(u {\,=\,} 0\)) and thus allowing for smooth segmentations even if I is noisy (here, \(\left| \cdot \right| \) denotes the \(l_2\) norm):

$$\begin{aligned} \hbox {TV}\left[ u \right] = \int _{\varOmega } \left| \nabla u \right| {\text {d}} \negthinspace x . \end{aligned}$$
(1)

While TV is a good regularizer for many applications, it seems not optimal in the context of tubular structure segmentation. This is because TV is isotropic; that is, changes of u are penalized regardless of orientation. If we want to segment a tube, however, we would like to employ the prior knowledge that its shape ideally does not change along its course; thus we would like to penalize changes along the tube’s direction more strongly than changes perpendicular to it. In other words, we would prefer an anisotropic regularization term.

In the proposed method, we thus use anisotropic total variation \(\hbox {ATV}\) [7, 8]:

$$\begin{aligned} \hbox {ATV}\left[ u; A \right] = \int _{\varOmega } \left( \nabla u ^{\mathsf {T}\!}A \nabla u \right) ^{1/2} {\text {d}} \negthinspace x = \int _{\varOmega } \left| S ^{\mathsf {T}\!}\nabla u \right| {\text {d}} \negthinspace x \quad \text{ with }\quad A = S S^{\mathsf {T}\!}, \end{aligned}$$
(2)

where \(A : \varOmega \rightarrow \mathbb {R}^{d \times d}\) is strongly positive definite in the sense of Olsson et al. [7] and S is a decomposition of A. For our particular choice of A and S, see Sect. 2.3.

If we assume, as a simple three-dimensional example, that \(A = \hbox {diag}\left( 1, a, a\right) \) with \(0< a < 1\), we see that changes along the \(x_{1}\) axis will be penalized more strongly than changes along \(x_{2}\) and \(x_{3}\). This would in fact be a meaningful choice if the tubular structure of interest were oriented along \(x_{1}\). From the example we can also see that \(\hbox {ATV}\) is a generalization of \(\hbox {TV}\), as \(\hbox {ATV}\) reduces to \(\hbox {TV}\) for \(a {\,=\,} 1\).
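As a small numerical check (ours, not part of the original text; numpy assumed, function name hypothetical), the pointwise integrand \(\left| S^{\mathsf {T}\!}\nabla u \right| \) of Eq. (2) behaves exactly as described for \(A = \hbox {diag}\left( 1, a, a\right) \):

```python
import numpy as np

def atv_integrand(grad_u, S):
    """Pointwise ATV integrand |S^T grad_u| from Eq. (2)."""
    return np.linalg.norm(S.T @ grad_u)

# A = diag(1, a, a)  =>  S = diag(1, sqrt(a), sqrt(a))
a = 0.25
S = np.diag([1.0, np.sqrt(a), np.sqrt(a)])

g_along = np.array([1.0, 0.0, 0.0])  # change along x1 (the tube axis)
g_perp = np.array([0.0, 1.0, 0.0])   # change perpendicular to it

print(atv_integrand(g_along, S))  # 1.0: fully penalized
print(atv_integrand(g_perp, S))   # 0.5 = sqrt(a): penalized less

# For a = 1, S is the identity and ATV reduces to TV: |S^T g| == |g|
assert np.isclose(atv_integrand(g_perp, np.eye(3)), np.linalg.norm(g_perp))
```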

2.2 ATV in Continuous Max Flow

The dual formulation of the max flow problem as stated in [10], with TV replaced by ATV regularization as we propose, is given by the min cut problem

$$\begin{aligned} \min _{u \in \left[ 0, 1 \right] } \int _{\varOmega } \left( 1 - u \right) \, C_{s} + u \, C_{t} + \left| S^{\mathsf {T}\!}\nabla u \right| \, C {\text {d}} \negthinspace x , \end{aligned}$$
(3)

with the source capacities \(C_{s}\), sink capacities \(C_{t}\), and nonterminal capacities C (\(C_{.}: \varOmega \rightarrow \mathbb {R}_{\ge 0}\)). Using integration by parts and the geometric definition of the scalar product, we can show for the nonterminal flow \(p: \varOmega \rightarrow \mathbb {R}^{d}\) that

$$\begin{aligned} \int _{\varOmega } \left| S^{\mathsf {T}\!}\nabla u \right| \, C {\text {d}} \negthinspace x = \max _{\left| p \right| \le C} \int _{\varOmega } u \, \hbox {div} \left( S p \right) {\text {d}} \negthinspace x . \end{aligned}$$
(4)

Together with the respective equalities for \(C_{s},C_{t}\) and the source and sink flows \(p_s,p_t: \varOmega \rightarrow \mathbb {R}\) (see Eqs. (18) and (19) in [10]), we derive the primal–dual formulation as \(\max _{p_{s}, p_{t}, p} \min _{u \in \left[ 0, 1 \right] }E \left[ p_{s}, p_{t}, p, u \right] \) with

$$\begin{aligned} E = \int _{\varOmega } \left( 1 - u \right) \, p_{s} + u \, p_{t} + u \, \hbox {div} \left( S p \right) {\text {d}} \negthinspace x = \int _{\varOmega } p_{s} + u \, \left( \hbox {div} \left( S p \right) - p_{s} + p_{t} \right) {\text {d}} \negthinspace x , \end{aligned}$$
(5)

subject to the flow capacity constraints \(p_{s}\le C_{s}\), \(p_{t}\le C_{t}\), and \(\left| p\right| \le C\).

Making use of the anisotropic coarea formula in [7], it can be shown that any \(u^{\ell }\) for a threshold \(\ell \in \left( 0, 1 \right) \), given by

$$\begin{aligned} u^{\ell }(x) = {\left\{ \begin{array}{ll} 1, &{} u^{*}(x)>\ell \\ 0, &{} u^{*}(x)\le \ell \end{array}\right. } \quad \text{ with } \quad u^{*} = \mathop {\hbox {arg min}}\limits _{u \in \left[ 0, 1 \right] } E , \end{aligned}$$
(6)

is a globally optimal solution for the binary problem corresponding to Eq. (3).
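The thresholding of Eq. (6) is straightforward; a minimal numpy sketch (function name ours):

```python
import numpy as np

def binarize(u_star, level=0.5):
    """Threshold the relaxed solution u* to a binary labeling per Eq. (6)."""
    return (u_star > level).astype(np.uint8)

print(binarize(np.array([0.2, 0.7])))  # [0 1]
```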

Following [10], we add an augmented Lagrangian term to Eq. (5), gaining

$$\begin{aligned} \max _{p_{s}, p_{t}, p} \min _{u \in \left[ 0, 1 \right] } \int _{\varOmega } p_{s} + u \, \left( \hbox {div} \left( S p \right) - p_{s} + p_{t} \right) - \frac{c}{2} \, \left( \hbox {div} \left( S p \right) - p_{s} + p_{t} \right) ^{2} {\text {d}} \negthinspace x \end{aligned}$$
(7)

as the final problem, which we propose to solve with Algorithm 1.

Algorithm 1
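Algorithm 1 is given as a figure in the original. Purely as an illustration, the update structure of a continuous max flow solver in the style of [10], with \(\hbox {div}\, p\) replaced by \(\hbox {div} \left( S p \right) \), might look as follows in 2-D. All names are ours, S is taken as spatially constant, boundary handling is simplified, and the defaults roughly follow the step sizes of Sect. 3 with \(\Delta x = 1\); this is a sketch under these assumptions, not the authors' implementation.

```python
import numpy as np

def grad(f):
    """Forward differences, zero at the upper boundary."""
    gx = np.zeros_like(f); gy = np.zeros_like(f)
    gx[:-1, :] = f[1:, :] - f[:-1, :]
    gy[:, :-1] = f[:, 1:] - f[:, :-1]
    return np.stack([gx, gy])

def div(p):
    """Backward differences, the negative adjoint of grad."""
    px, py = p
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0] = px[0]; dx[1:-1] = px[1:-1] - px[:-2]; dx[-1] = -px[-2]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def atv_max_flow(Cs, Ct, C, S, n_iter=300, c=0.2, gamma=0.11):
    """Sketch of an augmented-Lagrangian solver for Eq. (7) in 2-D."""
    u = np.zeros_like(Cs)
    ps = np.zeros_like(Cs); pt = np.zeros_like(Ct)
    p = np.zeros((2,) + Cs.shape)
    for _ in range(n_iter):
        # ascend on the nonterminal flow p, then project onto |p| <= C
        div_Sp = div(np.einsum('ij,jxy->ixy', S, p))
        g = grad(div_Sp - ps + pt - u / c)
        p += gamma * np.einsum('ji,jxy->ixy', S, g)  # S^T applied pointwise
        scale = np.maximum(1.0, np.sqrt((p ** 2).sum(0)) / np.maximum(C, 1e-12))
        p /= scale
        # closed-form source/sink flow updates under ps <= Cs, pt <= Ct
        div_Sp = div(np.einsum('ij,jxy->ixy', S, p))
        ps = np.minimum(Cs, div_Sp + pt + (1.0 - u) / c)
        pt = np.minimum(Ct, ps - div_Sp + u / c)
        # multiplier (labeling) update
        u -= c * (div_Sp - ps + pt)
    return np.clip(u, 0.0, 1.0)
```

The flows are updated in a Gauss-Seidel fashion, so each closed-form step already uses the most recent values of the other variables.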

2.3 ATV Regularization with Vesselness

Frangi et al. [3] examine the Hessian matrices (i.e. the second-order derivatives) in the scale space of the volumetric image I to calculate what they call vesselness. The idea is to determine from the ratios of the Hessians’ eigenvalues how closely the local structure in I resembles a tube.

In particular, let \(\mathcal {S}\) be a predefined set of scales that roughly match the expected tube radii. For each scale \(s \in \mathcal {S}\), let \(H_{s}\left( x \right) \) denote its Hessian approximation in x, calculated by convolving I with Gaussian derivatives of standard deviation s. Let \(\lambda _{i,s} \left( x \right) \) (\(i {\,=\,} 1,2,3\)) denote the sorted eigenvalues (\(\left| \lambda _{1,s} \right| \le \left| \lambda _{2,s} \right| \le \left| \lambda _{3,s} \right| \)) and \(v_{i,s} \left( x \right) \) corresponding eigenvectors of \(H_{s}\), such that

$$\begin{aligned} H_{s} = V_{s} \Lambda _{s} V_{s}^{\mathsf {T}\!}\quad \text{ with } \quad V_{s} = \left[ v_{1,s} | v_{2,s} | v_{3,s} \right] , \quad \Lambda _{s} = \hbox {diag}\left( \lambda _{1,s}, \lambda _{2,s}, \lambda _{3,s} \right) . \end{aligned}$$
(8)

Note that \(V_{s}^{\mathsf {T}\!}= V_{s}^{-1}\), as \(H_{s}\) is symmetric. Assuming bright tubular structures on dark background, the vesselness response is \(\nu \left( x \right) = \max _{s \in \mathcal {S}} \nu _{s} \left( x \right) \), where

$$\begin{aligned} \nu _{s} = {\left\{ \begin{array}{ll} 0, \quad \lambda _{2,s} \ge 0 \text{ or } \lambda _{3,s} \ge 0\\ \left( 1 - \exp (\frac{-1}{2w_{1}^{2}} \frac{\lambda _{2,s}^{2}}{\lambda _{3,s}^{2}} ) \right) \exp ( \frac{-1}{2w_{2}^{2}} \frac{\lambda _{1,s}^{2}}{ \lambda _{2,s}\lambda _{3,s} } ) \left( 1 - \exp ( \frac{-\sum _{i} \lambda _{i,s}^{2}}{2 w_{3}^{2}} ) \right) , \quad \text{ else } , \end{array}\right. } \end{aligned}$$
(9)

with the weighting factors \(w_{i} \in \mathbb {R}_{> 0}\). The eigenvectors for \(\nu \) are \(V = \left[ v_{1} | v_{2} | v_{3} \right] \), with \(v_{i} = v_{i,s^{*}}\) and \(s^{*} = \hbox {arg max}_{s \in \mathcal {S}} \nu _{s}\). In the original description of [3], no use of V is made. In our approach, we use the eigenvectors to steer the ATV regularizer. We observe that in points where \(\nu \) is high, \(v_{1}\) points along the local vessel orientation [6]. Recall that we want to regularize strongly along the direction of the vessel. Unfortunately, we cannot use \(v_{1}\) directly for this purpose, as it reliably gives the vessel’s direction in the vessel center only, where \(\nu \) is the highest. Therefore, we use the concept of gradient vector flow (GVF) [9] to first propagate the directions from places where \(\nu \) is high to regions where \(\nu \) is low, creating a smoothly varying vector field. The necessary steps are as follows.

Let \(\mathcal {R}\) be the set of vesselness ridge points; that is, the local maxima of \(\nu \), down to a noise threshold. As both \(-v_{1}\) and \(v_{1}\) are valid eigenvectors, we have to make sure that the vectors of neighboring points approximately point in the same rather than the opposite direction, so that they don’t cancel each other out when diffusing them via GVF. Thus, we fix their signs beforehand, gaining \(\bar{v}_{1}\): We calculate the minimum spanning tree over the ridge points \(\mathcal {R}\), select a root point, keep its sign, and traverse the tree. For each child point, we choose \(\bar{v}_{1}\) as either \(-v_{1}\) or \(v_{1}\), depending on which one maximizes the dot product (i.e. minimizes the angle) with its parent’s \(\bar{v}_{1}\). After traversal, the signs of the \(v_{1}\) for all remaining domain points \(x \in \varOmega \backslash \mathcal {R}\) are fixed w.r.t. their closest point in \(\mathcal {R}\), following the same rule. We scale all \(\bar{v}_{1}\) with \(\nu \), apply GVF, and scale the resulting vectors back to unit length, gaining \(\tilde{v}_{1}\). A comparison of the vector field before and after sign adjustment and GVF is shown in Fig. 1.
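The sign-fixing traversal over the ridge points can be sketched with scipy's graph routines (our simplification: the neighborhood graph is built from full pairwise distances, and the root is simply point 0):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, breadth_first_order
from scipy.spatial.distance import pdist, squareform

def fix_signs(points, v1):
    """Orient the eigenvectors v1 consistently along an MST of the ridge.

    points: (n, 3) ridge point coordinates; v1: (n, 3) unit eigenvectors,
    each defined only up to sign. Returns sign-fixed copies (v1 bar).
    """
    d = squareform(pdist(points))        # dense pairwise distances
    mst = minimum_spanning_tree(d)       # sparse MST over the ridge points
    sym = mst + mst.T                    # traversable in both directions
    order, parents = breadth_first_order(sym, i_start=0, directed=False)
    v1_bar = v1.copy()
    for i in order:
        par = parents[i]
        if par < 0:                      # the root keeps its sign
            continue
        if np.dot(v1_bar[i], v1_bar[par]) < 0:
            v1_bar[i] = -v1_bar[i]       # flip to align with the parent
    return v1_bar
```

A usage example: three collinear ridge points whose middle eigenvector is flipped come out consistently oriented, so the subsequent GVF diffusion cannot cancel them out.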

Finally, we recomplete \(\tilde{v}_{1}\) to an orthonormal basis \(\tilde{V} = \left[ \tilde{v}_{1} | \tilde{v}_{2} | \tilde{v}_{3} \right] \). The particular choice of \(\tilde{v}_{2}, \tilde{v}_{3}\) does not matter, as we will treat all directions perpendicular to \(\tilde{v}_{1}\) the same when regularizing. From \(\tilde{V}\), we construct A and S for Eq. (2) as

$$\begin{aligned} A = \tilde{V} \! \tilde{A} \tilde{V} ^{\mathsf {T}\!}\quad \text{ and } \quad S = \tilde{V} \! \tilde{A}^{1/2} \quad \text{ with } \quad \tilde{A} = \hbox {diag}\left( 1, a, a \right) \quad \text{ and } \quad 0 < a \le 1 . \end{aligned}$$
(10)

Notice the similarity to A in the example at the end of Sect. 2.1: The idea of regularizing one direction stronger than the others remains the same; however, as we now scale with \(\tilde{A} = \hbox {diag}\left( 1, a, a \right) \) in the new basis \(\tilde{V}\), we target the actual local vessel direction \(\tilde{v}_{1}\) rather than a fixed axis.
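The construction of Eq. (10) can be sketched for a single voxel as follows (ours; numpy assumed, and the basis completion via cross products is one arbitrary but valid choice, since \(\tilde{v}_{2}, \tilde{v}_{3}\) do not matter):

```python
import numpy as np

def build_A_and_S(v1_tilde, a):
    """Construct A and S of Eq. (10) from the unit vessel direction v1_tilde."""
    # recomplete v1_tilde to an orthonormal basis; v2, v3 are arbitrary
    helper = np.array([1.0, 0.0, 0.0])
    if abs(helper @ v1_tilde) > 0.9:        # avoid a near-parallel helper
        helper = np.array([0.0, 1.0, 0.0])
    v2 = np.cross(v1_tilde, helper)
    v2 /= np.linalg.norm(v2)
    v3 = np.cross(v1_tilde, v2)
    V = np.column_stack([v1_tilde, v2, v3])
    S = V @ np.diag([1.0, np.sqrt(a), np.sqrt(a)])  # S = V * A_tilde^(1/2)
    A = S @ S.T                                     # A = V * A_tilde * V^T
    return A, S
```

By construction, \(\tilde{v}_{1}\) is an eigenvector of A with eigenvalue 1, while every perpendicular direction has eigenvalue a.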

2.4 Flow Capacities

For the source and sink capacities \(C_{s}, C_{t}\) of Eqs. (3) and (5), we use a combination of the normalized image intensities and the distances to the vessel ridge points \(\mathcal {R}\). Intuitively, using the intensities enables the distinction of foreground (i.e. the vessel) and background (i.e. everything else), while the distances w.r.t. \(\mathcal {R}\) isolate the vessel's surroundings. This helps avoid oversegmentation in case other structures have intensities similar to those of the vessel of interest. More formally, let \(D : \varOmega \rightarrow \mathcal {D} {\,=\,} \mathbb {R}_{\ge 0}\) be a Euclidean distance map of \(\mathcal {R}\), and let \(I : \varOmega \rightarrow \mathcal {I} {\,=\,} \left[ 0, 1 \right] \) be the normalized image. Let \(p_{\mathrm {b}}^{D}, \, p_{\mathrm {f}}^{D}, \, p_{\mathrm {b}}^{I}, \, p_{\mathrm {f}}^{I}\) be predefined estimates of the background and foreground probability densities for \(\mathcal {D}\) and \(\mathcal {I}\), with \(p_{.}^{D}: \mathcal {D} \rightarrow \mathbb {R}_{\ge 0}\) and \(p_{.}^{I}: \mathcal {I} \rightarrow \mathbb {R}_{\ge 0}\). We calculate \(C_{s}, C_{t}\) as

$$\begin{aligned} C_s \left( x \right)&= \frac{1}{q} \, \max \left\{ {}r \left( x \right) , 0 \right\} ,&\text{ with } \quad r \left( x \right)&= \ln \left( \frac{p_{\mathrm {f}}^{D} \left( D \left( x \right) \right) \cdot p_{\mathrm {f}}^{I} \left( I \left( x \right) \right) + \varepsilon }{p_{\mathrm {b}}^{D} \left( D \left( x \right) \right) \cdot p_{\mathrm {b}}^{I} \left( I \left( x \right) \right) + \varepsilon } \right) , \end{aligned}$$
(11)
$$\begin{aligned} C_t \left( x \right)&= \frac{1}{q} \, \max \left\{ -r \left( x \right) , 0 \right\} ,&q&= \ln \left( \frac{\max \{ \hat{p}_{\mathrm {b}}^{D} \!\! \cdot \hat{p}_{\mathrm {b}}^{I}, \; \hat{p}_{\mathrm {f}}^{D} \!\! \cdot \hat{p}_{\mathrm {f}}^{I} \} + \varepsilon }{\varepsilon } \right) , \end{aligned}$$
(12)

where \(\hat{p}_{.}^{.}= \max p_{.}^{.}\). The small positive constant \(\varepsilon \) avoids zero logarithms and zero divisions in r. Normalization with q ensures that \(C_{s},C_{t} \in \left[ 0, 1 \right] \), which eases their balancing with the nonterminal capacities C of Eqs. (3) and (5).
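Eqs. (11) and (12) translate directly into code; in this sketch (ours) the densities are passed as callables and their maxima \(\hat{p}_{.}^{.}\) are estimated on the given samples, whereas the paper uses the maxima of the density estimates themselves:

```python
import numpy as np

def terminal_capacities(p_fD, p_bD, p_fI, p_bI, D, I, eps=1e-9):
    """Source/sink capacities Cs, Ct of Eqs. (11)-(12).

    p_* are callables evaluating the foreground/background densities
    on arrays of distances D and intensities I."""
    fg = p_fD(D) * p_fI(I)
    bg = p_bD(D) * p_bI(I)
    r = np.log((fg + eps) / (bg + eps))
    # normalizer q from the individual density maxima, Eq. (12)
    q = np.log((max(p_bD(D).max() * p_bI(I).max(),
                    p_fD(D).max() * p_fI(I).max()) + eps) / eps)
    Cs = np.maximum(r, 0.0) / q
    Ct = np.maximum(-r, 0.0) / q
    return Cs, Ct
```

Note that at every point at most one of \(C_{s}, C_{t}\) is nonzero, since r cannot be both positive and negative.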

Using C, we try to move the segmentation boundary to image edges by making C small where the intensity gradient magnitude is high and vice versa:

$$\begin{aligned} C \left( x \right) = w \exp \left( {-1}/{\varsigma ^{2}} \, \left| \nabla I \left( x \right) \right| ^{2} \right) \quad \text{ with } \quad w,\varsigma \in \mathbb {R}_{> 0}, \end{aligned}$$
(13)

where \(\varsigma \) controls C’s sensitivity regarding the size of \(\left| \nabla I \right| \) and w balances C and \(C_{s}, C_{t}\). For an example of the capacities, see Fig. 1.
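Eq. (13) is a single expression; a sketch (ours) using central differences for \(\nabla I\), as in Sect. 3:

```python
import numpy as np

def nonterminal_capacity(I, w, varsigma):
    """Edge-sensitive capacity C of Eq. (13); np.gradient uses central differences."""
    grad_sq = sum(g ** 2 for g in np.gradient(I))  # |grad I|^2, any dimension
    return w * np.exp(-grad_sq / varsigma ** 2)
```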

3 Experiments and Results

Implementation: Most of the method’s steps can be described as embarrassingly parallel, which means they can be calculated independently for different voxels. This is true, for example, for the vesselness \(\nu \), GVF, the capacities \(C_{s}, C_{t}, C\), and large parts of Algorithm 1. For this reason, we ported code to the GPU wherever possible. To reduce memory consumption, which is still a limiting factor for GPU programming, we represent the pointwise basis matrices \(\tilde{V}\) in Eq. (10) as equivalent unit quaternions. Derivatives are approximated using forward (\(\nabla \), Algorithm 1), backward (\(\hbox {div} \), Algorithm 1), and central (\(\nabla \), Eq. 13) differences.
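The quaternion representation mentioned above can be illustrated with a generic rotation-matrix round trip (ours, not the authors' code): a 3 x 3 orthonormal basis (9 floats) is stored as a unit quaternion (4 floats). Note that \(\tilde{V}\) must be a proper rotation (determinant +1), which may require flipping one column beforehand.

```python
import numpy as np

def mat_to_quat(R):
    """Unit quaternion (w, x, y, z) from a rotation matrix with det R = +1."""
    w = 0.5 * np.sqrt(max(0.0, 1.0 + R[0, 0] + R[1, 1] + R[2, 2]))
    x = np.copysign(0.5 * np.sqrt(max(0.0, 1 + R[0, 0] - R[1, 1] - R[2, 2])),
                    R[2, 1] - R[1, 2])
    y = np.copysign(0.5 * np.sqrt(max(0.0, 1 - R[0, 0] + R[1, 1] - R[2, 2])),
                    R[0, 2] - R[2, 0])
    z = np.copysign(0.5 * np.sqrt(max(0.0, 1 - R[0, 0] - R[1, 1] + R[2, 2])),
                    R[1, 0] - R[0, 1])
    return np.array([w, x, y, z])

def quat_to_mat(q):
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])
```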

Parameterization: For all experiments, the following parameters were chosen. Algorithm 1: \(\hat{\epsilon } {\,=\,} 10^{-6}\), \(\gamma {\,=\,} 0.11 \Delta x ^ 2\), \(c {\,=\,} 0.2 \Delta x ^ 2\) (\(\Delta x\): minimum voxel edge length in mm); GVF: 316 iterations with a regularization parameter of \(\mu {\,=\,} 3.16\) and step size determined as defined and described in [9]; Eq. (6): \(\ell {\,=\,} 0.5\); Eq. (9): \(w_{1} {\,=\,} w_{2} {\,=\,} 0.5\) as suggested in [3], \(w_{3}\) determined following [3]; Eq. (12): \(\varepsilon {\,=\,} 10^{-9}\).

Fig. 2. Phantom experiment. Left to right: Mean intensity projection at \(\sigma {\,=\,} 1.5\); segmentation with ATV; segmentation with TV; Dice coefficients w.r.t. ground truth (GT) for all noise levels \(\sigma \).

Fig. 3. Spinal cord experiment. Left to right: Mean surface distances for ATV/TV/PropSeg [1] vs. \(E_{1}\) and \(E_{2}\); corresponding Hausdorff distances; corresponding Dice coefficients; T1 Dice coefficients for ATV vs. \(E_{1}\) and \(E_{2}\) with varying parameter values (central bar: median, circle: mean, box limits: 25/75th percentile, whiskers: extrema).

Helical Phantom. In this simulation experiment, we rendered images of a synthetic helical phantom with values in [0, 1] to which we added Gaussian noise of standard deviation \(\sigma \) (Fig. 2). The phantom’s tube radius varied between 3 mm and 6 mm, so we set \(\mathcal {S}\)  =  [2.5 mm, 7.2 mm] (16 scales) in Eq. (9). We segmented the images with TV and ATV regularization, setting \(a {\,=\,} 1\) and \(a {\,=\,} 0.03\) in Eq. (10), respectively. We modeled \(p_{.}^{I}\) as normal distributions, using the true background, foreground, and noise level values. For the sake of simplicity, we set \(p_{.}^{D} {\,=\,} 1\) always. For w and \(\varsigma \) in Eq. (13), we made grid searches for each noise level, using the Dice coefficients w.r.t. the ground truth as optimization criterion.

Figure 2 shows the Dice coefficients for the best \(w, \varsigma \) combinations. The advantage of ATV becomes apparent as soon as the noise level increases.

Spinal Cord. In this experiment with real data, two clinical experts (\(E_1\), \(E_2\)) manually segmented 10 MR scans (5 T1, 5 T2) of the healthy human cervical spinal cord over the C1–C3 region. For each image, we then used the remaining four images of the same sequence (T1/T2) to estimate \(p_{.}^{.}\) and to find an optimal parameter combination for \(a,w,\varsigma \). The distributions \(p_{.}^{.}\) were estimated from the manual labelings of the remaining four images, modeling \(p_{\mathrm {b}}^{.}\) as mixtures of four Gaussians and \(p_{\mathrm {f}}^{.}\) as normal distributions. We set \(\mathcal {S}\)  =  [2 mm, 4 mm] (16 scales) in Eq. (9). The parameters \(a,w,\varsigma \) were optimized via grid search, using the mean Dice coefficients of the remaining four images w.r.t. their manual segmentations as optimization criterion. These distributions and the determined optimum parameterization were then used to segment the left-out image for evaluation of the method. For comparison with the state of the art, we also segmented all images with PropSeg [1].

Figure 3 shows the averaged mean surface distances, Hausdorff distances, and Dice coefficients w.r.t. the manual segmentations. The T1 images in particular benefit from ATV, as they have both a lower resolution (T1: \(1 {\times } 1 {\times } 1 \, \mathrm {mm^3} \), T2: \(0.75 {\times } 0.38 {\times } 0.38 \, \mathrm {mm^3}\)) and a lower contrast-to-noise ratio (about one eighth) than the T2 images. On the right, Fig. 3 shows the Dice coefficients for applying a wide range of parameter values to the T1 images, demonstrating the robustness of our method w.r.t. parameterization. For each varied parameter, the others were kept constant (\(a {\,=\,} 0.03\), \(w {\,=\,} 0.32\), \(\varsigma {\,=\,} 1.00\)).

4 Discussion and Conclusion

We presented a fully automated method for the segmentation of tubular structures. For a single image of \(256^3\) voxels, the complete process of calculating the vesselness, GVF, capacities, and segmentation takes about 1 to 1.5 min (GPU: Nvidia GeForce GTX 770). Although image segmentation in general and tubular structure segmentation in particular have often been addressed, the results of our comparison with a state-of-the-art approach lead us to believe that our method may be of value to the scientific community. Future experiments will have to show in more detail how strongly the segmentation quality depends on the outcome of the Frangi vesselness response and what influence a particular GVF parameterization has in this context. Furthermore, the use of alternative vesselness indicators will have to be considered. We provide our reference implementation at https://github.com/spezold/miccai2016.