Keywords

1 Introduction

Organ segmentation from medical images is an essential process for computer-aided diagnosis and interventions. In upper abdominal surgery, such as laparoscopic gastrectomy, pancreatectomy, pancreas region information is strongly required for enabling safer surgical procedure. Multi-organ and single-organ segmentation methods for the abdominal region have been proposed [16]. High segmentation accuracies were reported for the liver, kidneys, and so on. Compared to those organs, the segmentation accuracies of the pancreas remain low.

The difficulty of pancreas segmentation results from the large variations of its position and shape. The pancreas is small, long, and thin. Its position and shape have a strong relationship to the surrounding organs, including the stomach, the liver, and the intestines. The pancreas slides and changes its shape and is affected by the surrounding organs. To segment it, we have to consider two variations: (1) shape and (2) spatial (positional variation in the body). Previously, many probabilistic atlas-based pancreas segmentation methods [15] have been developed. Such atlas-based methods generate a map that represents probability or likelihood of an organ’s existence in medical images. Patient-specific-atlas [1, 4, 5] and hierarchical-atlas [3] segmentation methods were developed to deal with the shape variation of organs. However, the spatial variation of the organ is not very well dealt with. Previous methods have tried to localize probabilistic atlases using positional information of other organs [2, 3] or manually specified information [1, 4].

We propose a fully automated atlas-based pancreas segmentation method. To deal with the spatial variation of the pancreas, we used a regression forest technique to estimate its position and size. We construct a patient-specific probabilistic atlas of the pancreas from the CT volumes in a database. The atlas is constructed for an unknown input CT volume taking the anatomical structures around the pancreas into account. The pancreas position and shape are related to the surrounding blood vessels. We use directed structure specific (DSS) volumes, which store blood vessel position and direction information, in the similar volume retrieval of an atlas construction. In the segmentation process, we segment the pancreas using the expectation maximization (EM) algorithm which uses the atlas as prior and the graph-cut optimization method.

The contributions of this paper are summarized as follows: (1) automated pancreas localization based on the relationships between local appearances and pancreas positions; (2) patient-specific probabilistic atlas generation of the pancreas taking the blood vessel position and direction information into account; and (3) a fully automated framework of pancreas segmentation.

2 Method

2.1 Overview

Our method segments the pancreas region from an input CT volume. The pancreas position and size (represented as a bounding box) are estimated by a regression forest technique. The localization is performed by considering the positional relationships between the local appearances in the CT volume and the pancreas position. Then we construct a patient-specific probabilistic atlas of the pancreas. To construct it, we perform CT volume retrieval from a training database, which resembles the unknown input CT volume using directed line structure similarity. We construct an atlas from the selected volumes and use it for automatic segmentation.

2.2 Pancreas Localization

Regression Forest Training. Our training database includes CT volumes and their corresponding manually segmented pancreas regions. We defined the axis-aligned minimum bounding box of the pancreas as follows: \({\mathbf b}=(b_{L}, b_{R}, b_{A}, b_{P}, b_{H}, b_{F})\). These elements represent the left, right, anterior, posterior, head, and foot side face positions of the bounding box in the CT coordinate system. We obtain patches of the size of \(p \times p \times p\) voxels from the CT volumes on a regular grid. The patch’s center position is defined as \(\mathbf{v}=(v_{x}, v_{y}, v_{z})\). The patch’s offset with respect to \(\mathbf{b}\) is shown as \(\mathbf{d}= \mathbf{b}-\mathbf{\hat{v}}\), where \(\mathbf{\hat{v}}=(v_{x}, v_{x}, v_{y}, v_{y}, v_{z}, v_{z})\).

A regression forest is constructed for each face position of the bounding box, which is obtained by constructing six regression forests. During the construction of a regression tree, training patches are pushed to the root of the tree. At each split node, the thresholding of a feature value of a training patch is performed to decide which child node should be selected for the training patch. Similar to [7], we use

$$\begin{aligned} f(\mathbf{v}) = \frac{1}{|F_{1}|} \sum _{\mathbf{q} \in F_{1}} I(\mathbf{q}) - \frac{1}{|F_{2}|} \sum _{\mathbf{q} \in F_{2}} I(\mathbf{q}) \end{aligned}$$
(1)

as the feature value of \(\mathbf{v}\). \(I(\mathbf{q})\) is a CT value at voxel \(\mathbf{q}\) in a CT volume. \(F_{1}\) and \(F_{2}\) are cuboids arbitrarily arranged in a patch. \(|F_{1}|\) and \(|F_{2}|\) denote the number of voxels in the cuboids. We randomly selected the positions and the sizes of the cuboids in a patch. Two cuboids may overlap.

For each split node, we calculate the variance of the offset value

$$\begin{aligned} \sigma ^{2} = \frac{1}{S} \sum ^{S}_{n=1}(\mathbf{d}_{n}-\mathbf{{\bar{d}}})^{2}, \end{aligned}$$
(2)

where S is the number of patches reaching the node, \(\mathbf{d}_{n}\) is the offset in the node, and \(\mathbf{{\bar{d}}}\) is the average of the offsets in the node. The variance of a parent node is

$$\begin{aligned} s^{2} = \sum _{m \in \{L,R\}} \omega _{m} \sigma ^{2}_{m}, \end{aligned}$$
(3)

where \(\omega _{m}=S_{m}/S\). We iterate the random selection of the feature values \(n_{F}\) times and the random selection of the threshold at each node \(n_{H}\) times. A pair of a feature value and a threshold, which gives the minimum variance, is recorded in a split node. A split node is changed to a leaf node if the number of patches that arrive at the node is fewer than \(n_{min}\) or the tree depth exceeds D. In each leaf node, we generate a histogram of patches about their offset. We estimate the parameters of a normal distribution that fit the histogram. The EM algorithm is used to estimate the parameters of the normal distribution (mean and variance).

Bounding Box Estimation by Regression Forest. We obtain patches from an unknown input CT volume. The patches are input to the root of each tree. In each split node, the feature values are calculated and thresholded using the recorded values. The result is used to select a child node to go through. When the patch reaches a leaf node, the average of the normal distribution, which is recorded in the leaf node, is given as the output of the regression tree. The output is offset from the patch position. The sum of the offset and the patch position is the estimated position of the face of a bounding box. The average of the estimated positions of all the patches is the final estimation result of the face position of a bounding box.

Fig. 1.
figure 1

An example of the DSS volume overlaid on CT volume. Pancreas is shown in yellow. DSS volume intensities are indicated by colors: red is high, and blue is low. (a) 3D view, (b) coronal, and (c) sagittal slice views. DSS volume has high values in blood vessels, especially in SV, which is indicated by red circles.

2.3 Directed Structure Specific (DSS) Volume

A probabilistic atlas of the pancreas is generated from volumes in the training database that have high similarities to the input CT volume. We generate DSS volumes for the atlas selection. Blood vessels are circulating around the pancreas. The splenic vein (SV), whose position has high correlation with the pancreas position. We generate a DSS volume that stores high values in the SV. An example of DSS volume is shown in Fig. 1.

The multi-scale line structure enhancement filter [8] is applied to a CT volume. We generate DSS volume as follows:

$$\begin{aligned} M(\mathbf{x}) = \left\{ \begin{array}{ll} w \cdot \mathop {\max }\nolimits _{1 \le k \le m} \{ \sigma ^{2}_{k} \cdot \lambda _{123}(\mathbf{x};\sigma _{k}) \}, &{}\mathrm{if} \ | \mathbf{e}^\mathrm{T}_{1}{} \mathbf{u}_{z} | \le \tau , \\ \mathop {\max }\nolimits _{1 \le k \le m} \{ \sigma ^{2}_{k} \cdot \lambda _{123}(\mathbf{x};\sigma _{k}) \}, &{}\mathrm{otherwise}, \end{array} \right. \end{aligned}$$
(4)

where \(\sigma _{k} = k\sigma _{1}\), \(\sigma _{1}\) is the size of the enhancement target line and \(\lambda _{123}\) is the response of the line structure enhancement filter. \(\mathbf{e}_{1}\) is an eigenvector that corresponds to the largest eigenvalue of the Hessian matrix calculated at \(\mathbf{x}\), \(\mathbf{u}_{z}\) is the head to foot direction unit vector, \(\tau \) is a threshold of the vector directions, and m is the scale step number. At the voxels in the line structures, \(\mathbf{e}_{1}\) becomes the direction along the line, meaning that \(\mathbf{e}_{1}\) shows the blood vessel directions. The SV direction is almost perpendicular to the head to foot direction. We give higher weight in the calculation of Eq. 4 at the voxels in the SV using the condition of the \(\mathbf{e}_{1}\) and \(\mathbf{u}_{z}\) directions.

2.4 Patient-Specific Probabilistic Atlas Generation

We generate normalized pancreas bounding boxes of the training database and the input CT volume. The bounding box of each data in the database is estimated using a method described in Sect. 2.2. We add a fixed size of the margin, which is selected so that bounding box covers most of the pancreas region, to the bounding box for reducing the effect of estimation error. We crop the CT volume and a corresponding manually segmented pancreas in the bounding box with the margin from the data in the database. The sizes of the cropped CT volume and the manually segmented pancreas are normalized to an image size of \(B_{v}^{3}\) voxels and a voxel size of \(B_{s}^{3}\) mm. We call a set of a cropped CT volume and the corresponding cropped manually segmented pancreas in the database a database VOI. Also, the input CT volume is cropped and we refer to this as input VOI. A schematic illustration of VOI generation is shown in Fig. 2.

Fig. 2.
figure 2

Pancreas bounding box is estimated on CT volume. Fixed margin is added to it and its size is normalized to create VOI.

Next we explain how we select the database VOIs for atlas generation. DSS volumes are generated from the CT volumes in the database VOIs and the input VOI. All of the CT volumes in the database VOIs are registered to the CT volume in the input VOI using the MRF-based non-rigid registration method [9]. The corresponding manually segmented pancreases and DSS volumes are also deformed by the registration process. We calculate the similarities between the DSS volumes of the database VOIs and the input VOI by the zero-mean normalized cross-correlation (ZNCC)[10]. We select \(N_{s}\) database VOIs that have high similarities with input VOI.

A patient-specific probabilistic atlas for the input CT volume is generated from the selected database VOIs. The atlas is obtained by

$$\begin{aligned} M_{\mathbf{p} \in V^{(l)}} = \frac{ \sum ^{N_{s}-1}_{i=0} z_{i} \cdot \delta (L^{i}_\mathbf{p},l) }{ \sum ^{N_{s}-1}_{i=0} z_{i} }, \end{aligned}$$
(5)

where \(M_{\mathbf{p} \in V^{(l)}}\) is a probabilistic atlas of organ label l for voxel \(\mathbf{p}\) in input VOI V. \(z_{i}\) is the weight of the i-th database VOI. Here we use the ZNCC value as the weight. \(\delta (L^{i}_\mathbf{p},l)\), which is a delta function, is described as

$$\begin{aligned} \delta (l,l') = \left\{ \begin{array}{ll} 1, &{} \ \ \ \mathrm{if} \ l=l', \\ 0, &{} \ \ \ \mathrm{otherwise},\ \end{array} \right. \end{aligned}$$
(6)

where \(L^{i}_\mathbf{p}\) is an organ label of voxel \(\mathbf{p}\) in the i-th database VOI.

2.5 Pancreas Segmentation

For the input CT volume, a rough segmentation is first performed using the patient-specific probabilistic atlas generated in Sect. 2.4. We used maximum a posteriori estimation in the rough segmentation [4]. Precise segmentation is performed using the graph-cut method [11] to refine rough segmentation result.

3 Experiments and Discussion

We evaluated the proposed method using 147 cases of abdominal CT volumes. The following are the acquisition parameters of the CT volumes: image size: \(512 \times 512\) voxels, slice number: 263–1061 slices, pixel spacing: 0.546–0.820 mm, and slice spacing: 0.40–0.80 mm. The ground truth pancreas, liver, spleen, and kidneys regions were semi-automatically made by three trained researchers. The ground truth region of each case was made by one of three researchers. All ground truth regions were checked by a medical doctor. The parameters of the method were experimentally selected as \(p=25, n_{F}=40, n_{H}=500, n_{min}=20, D=15, B_{v}=256, B_{s}=1.0, N_{s}=20, w=2.0, m=7, \tau =0.25,\) and \(\sigma _{1}=1.0\). We evaluated the proposed method by leave-one-out cross validation by the following evaluation metrics: Jaccard Index (JI) and Dice Overlap (DICE). Table 1 compares the accuracies of the proposed and previous methods. Figure 3 shows the segmentation results. The average computation times of the pancreas localization and segmentation were 44.7 seconds and about three hours per case.

Table 1. Accuracies of proposed and previous pancreas segmentation methods.
Fig. 3.
figure 3

Examples of pancreas segmentation (left) and corresponding probabilistic atlas (right) of proposed and previous [4] methods. These are slice images of calculated VOIs. Note that their aspect ratios are different from original CT images because size normalization was applied. Segmented pancreases are colored blue and overlaid on axial CT slices. In probabilistic atlases, red means high and blue means low values. Three pairs (a)–(b), (c)–(d), and (e)–(f) show same cases and same axial slices.

The most important contribution of our method is full automation of the pancreas segmentation process while achieving comparable accuracy to the previous state-of-the-art method [4]. The results show that the segmentation accuracy of the proposed method exceeded most of the previous methods. Our fully automated method is quite useful for various medical assistance applications. The proposed method tackles the spatial and shape variations of the pancreas via regression forest-based localization and patient-specific atlases. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the DICE.

Previous methods [24] utilized manually segmented regions or the organ positions (other than the pancreas) for pancreas segmentations. They use this information for pancreas localization. Our method does not require any manual input or additional information for atlas localization; this is its great advantage. In Figs. 3(a) and (c), the calculated VOIs included the pancreas regions in their centers. These demonstrate that the pancreas localization successfully works. In these cases, much fat tissues exist among organs. Fat tissues have lower CT values compared to organs. Existence of fat tissue clarifies the organ contours. We used the intensity difference-based feature value (Eq. 1) in the pancreas localization. Clear organ contours make accurate localization of the pancreas. Compared to them, the pancreas touches the surrounding organs in Fig. 3(e). This makes the organ contours obscure and reduces the segmentation accuracy.

Results of the previous method [4] are shown in Figs. 3(b), (d) and (f). The previous method generates pancreas VOIs using the manually specified liver positions. VOIs of the previous method correctly included the pancreas in most cases. However, probabilistic atlases of the previous method were not good. Atlas that have high and low likelihoods inside and outside the pancreas contributes accurate segmentations. The previous method used line structure enhanced volumes for probabilistic atlas generation. We newly introduced DSS volume for probabilistic atlas generation. In the DSS volume, the SV and other blood vessels were enhanced. The enhancement result was quite effective for selecting similar CT volumes from the database in the atlas generation. In Figs. 3(a) and (c), generated atlases have high values in pancreas regions. The good atlases contributed segmentation accuracy improvements.

This paper proposed a fully automated pancreas segmentation method from a CT volume using a regression forest technique and patient-specific atlas generation. In the atlas generation, we introduced a new similarity based on blood vessel position and direction. Experiments using 147 CT volumes outperformed the other state-of-the-art methods. Future work includes the utilization of new feature values that can capture the local appearances of organs in pancreas localization and remove the misalignments of bounding box faces. Assessment of inter- and intra-observer variability of the ground truth data is also required.