1 Introduction

Accurate and robust segmentation of anatomical structures from magnetic resonance (MR) images is valuable in many computer-aided clinical tasks. However, it is often challenging due to ambiguous structure boundaries, resemblance among structures, insufficient image contrast and/or spatial resolution, intensity inhomogeneity, and image dimensionality, especially in the case of three-dimensional (3D) images. Segmentation of vertebral bodies (VBs) from 3D MR images has been the topic of several studies, as it represents a non-invasive approach to precise quantitative evaluation of many spinal disorders [1], including spondylolisthesis, disc herniation, stenosis and osteoporosis. Štern et al. [2] introduced superquadric models that were aligned to VBs in 3D MR images and yielded a distance error of \(1.85\,\text {mm}\), Neubert et al. [3] applied active shape models and achieved a distance error of \(0.67\,\text {mm}\) and a Dice overlap of \(91\,\%\), while Zukić et al. [4] used VB boundary models for MR image feature classification and model mesh inflation for VB segmentation, yielding a distance error of \(1.76\,\text {mm}\) and a Dice overlap of \(79.3\,\%\). Schwarzenberg et al. [5] described a graph-cut segmentation based on the deformation of a cubic template, achieving a Dice overlap of \(81.3\,\%\). Recently, Chu et al. [6] presented a learning-based method for localization and segmentation of VBs from computed tomography and MR images, where the likelihood of each voxel belonging to a VB or to the background was estimated by random forest classifiers, and reported a distance error of \(1.5\,\text {mm}\) and a Dice overlap of \(88.7\,\%\) when segmenting 3D MR images. The authors also publicly released a database of 3D MR spine images with annotated VBs, thereby enabling objective quantitative comparison of VB segmentation algorithms.

In this paper, we propose an automated method for supervised segmentation of VBs from 3D MR spine images. For this purpose, a 3D convolutional neural network (CNN) architecture was designed that learns the appearance of VBs and background from a training set of MR images, and generates 3D spatial VB probability maps that guide deformable models towards VB boundaries to obtain accurate VB segmentations in a previously unseen MR image.

Fig. 1. (a) The pipeline of the proposed methodology. (b) Spatial vertebral body (VB) probability maps and corresponding segmentations for three selected VBs.

2 Methodology

Let \(\mathcal {M}\) represent the 3D mesh of the mean shape model of VBs in the training set. By coarsely initializing \(\mathcal {M}\) in the 3D image and increasing the size of its minimum bounding box, the volume of interest (VOI) that contains the observed VB is determined. A 3D CNN classifier [7, 8] is first applied to obtain the 3D spatial VB probability map, which represents likelihoods of VOI voxels to belong to the observed VB. The deformable modeling process is then guided by the obtained map and constrained by VB shapes in the training set to obtain a smooth and accurate segmentation of the observed VB (Fig. 1(a)).

2.1 Generation of 3D Spatial VB Probability Maps Using CNN

Designed as a special kind of artificial neural network classifier, a CNN exploits the spatial structure of the input image, providing a degree of generalization as well as robustness to simple geometric transformations [7]. Starting with an input in the form of a 3D image patch \({\varvec{P}}\) around the observed VOI voxel, the CNN is formed by alternating convolutional and pooling layers (\(l = 1, 2, \ldots , L'\)) that are attached to fully connected layers (\(l = L'+1, L'+2, \ldots , L\)).

Convolutional and Pooling Layers. First, several convolutions are performed in parallel to produce the m-th ordinary (feature) map \({\varvec{X}}_{l}^{m}\) of the l-th convolutional layer, with the value at position (i, j, k):

$$\begin{aligned} {\varvec{X}}_{l}^{m}(i,j,k) = b_{l}^{m} + \sum _{n = 1}^{M_{l-1}} \sum _{p = 1}^{P_l} \sum _{q = 1}^{Q_l} \sum _{r = 1}^{R_l} {\varvec{W}}_{l}^{m,n}(p,q,r) \, {\varvec{Z}}_{l-1}^{n}(i+p-1,j+q-1,k+r-1), \end{aligned}$$
(1)

where \(b_{l}^{m}\) denotes the additive bias, and \({\varvec{W}}_{l}^{m,n}\) the convolution kernel of height \(P_l\), width \(Q_l\) and depth \(R_l\), connected to the n-th (out of \(M_{l-1}\)) pooled map \({\varvec{Z}}_{l-1}^{n}\) in the \((l-1)\)-th pooling layer. The activated map \({\varvec{Y}}_{l}^{m} = f_l({\varvec{X}}_{l}^{m})\) is obtained by running \({\varvec{X}}_{l}^{m}\) through an activation function \(f_l\), and is then transformed into the m-th pooled map \({\varvec{Z}}_{l}^{m}\) of the l-th pooling layer, with the value at position (i, j, k):

$$\begin{aligned} {\varvec{Z}}_{l}^{m}(i,j,k) = g_l \left( {\varvec{Y}}_{l}^{m}(p,q,r) \;\Big|\; {\begin{array}{c} p \in \left\{ \widetilde{P}_l(i-1)+1, \ldots , \widetilde{P}_l i\right\} \\ q \in \left\{ \widetilde{Q}_l(j-1)+1, \ldots , \widetilde{Q}_l j\right\} \\ r \in \left\{ \widetilde{R}_l(k-1)+1, \ldots , \widetilde{R}_l k\right\} \end{array} } \right) , \end{aligned}$$
(2)

where \(g_l\) is a pooling function on a kernel of height \(\widetilde{P}_l\), width \(\widetilde{Q}_l\) and depth \(\widetilde{R}_l\).
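To make the indexing in Eqs. (1) and (2) concrete, the following is a minimal NumPy sketch of a single "valid" 3D convolution and a non-overlapping max-pooling step; it is an illustration of the formulas, not the authors' implementation, and the array shapes are assumptions:

```python
import numpy as np

def conv3d_valid(Z_prev, W, b):
    """Eq. (1): one ordinary (feature) map from a 'valid' 3D convolution.

    Z_prev : (M_prev, D1, D2, D3) pooled maps of layer l-1
    W      : (M_prev, P, Q, R)    kernels connecting them to the m-th map
    b      : scalar               additive bias b_l^m
    """
    _, P, Q, R = W.shape
    _, D1, D2, D3 = Z_prev.shape
    X = np.full((D1 - P + 1, D2 - Q + 1, D3 - R + 1), b, dtype=float)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            for k in range(X.shape[2]):
                # sum over input maps n and kernel offsets p, q, r
                X[i, j, k] += np.sum(W * Z_prev[:, i:i+P, j:j+Q, k:k+R])
    return X

def max_pool3d(Y, p, q, r):
    """Eq. (2) with g_l = max: each p x q x r block is replaced by its maximum."""
    D1, D2, D3 = Y.shape
    Y = Y[:D1 - D1 % p, :D2 - D2 % q, :D3 - D3 % r]   # crop to whole blocks
    return Y.reshape(D1 // p, p, D2 // q, q, D3 // r, r).max(axis=(1, 3, 5))
```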

Fully Connected Layers. Convolutional and pooling layers are attached to fully connected layers by representing the \(M_{L'}\) pooled maps \({\varvec{Z}}_{L'}^{m}\) in the \(L'\)-th pooling layer with vector \({\varvec{w}}_{L'}\) that is sent to the first fully connected layer. Vector \({\varvec{w}}_{l}\) of size \(S_{l}\) is the map of the l-th fully connected layer, with the value at position i:

$$\begin{aligned} {\varvec{w}}_{l}(i) = h_l \left( {\varvec{b}}_{l}(i) + \sum _{j=1}^{S_{l-1}}{\varvec{W}}_{\!l}(i,j) \cdot {\varvec{w}}_{l-1}(j) \right) , \end{aligned}$$
(3)

where \(h_l\) denotes an activation function, \({\varvec{b}}_{l}\) the additive bias vector, and \({\varvec{W}}_{\!l}\) the weight matrix of size \(S_l \times S_{l-1}\). The output \({\varvec{w}}_{L}\,{=}\,(w_{L,1},w_{L,2})^T\) of the L-th fully connected layer is passed through the softmax function to obtain vector \({\varvec{p}}\):

$$\begin{aligned} {\varvec{p}} = (p_1, p_2)^T = \frac{(e^{w_{L,1}},e^{w_{L,2}})^T}{e^{w_{L,1}} + e^{w_{L,2}}}; \quad \quad p_1+p_2=1; \quad \quad p_1,p_2 \in [0,1], \end{aligned}$$
(4)

composed of probabilities \(p_1\) and \(p_2\) that the observed VOI voxel belongs to the object of interest (i.e. VB) or background, respectively, according to the input \({\varvec{P}}\), weights \({\varvec{W\!}}=\{{\varvec{W}}_{\!1},\ldots ,{\varvec{W}}_{\!L}\}\) and biases \({\varvec{b}} = \{{\varvec{b}}_1,\ldots ,{\varvec{b}}_L\}\) (Fig. 1(b)).
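As a sketch, Eqs. (3) and (4) amount to a matrix-vector product with an activation, followed by a two-class softmax; the layer sizes below are only illustrative:

```python
import numpy as np

def fully_connected(w_prev, W, b, h):
    """Eq. (3): w_l(i) = h(b(i) + sum_j W(i, j) * w_{l-1}(j))."""
    return h(b + W @ w_prev)

def softmax(w_L):
    """Eq. (4); subtracting the maximum is a standard numerical-stability trick."""
    e = np.exp(w_L - np.max(w_L))
    return e / e.sum()

rng = np.random.default_rng(0)
w = rng.standard_normal(384)                        # vectorized pooled maps w_L'
w = fully_connected(w, rng.standard_normal((192, 384)),
                    np.zeros(192), h=lambda x: np.maximum(x, 0.0))
p = softmax(fully_connected(w, rng.standard_normal((2, 192)),
                            np.zeros(2), h=lambda x: x))
print(p, p.sum())                                   # p1 + p2 = 1
```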

2.2 Segmentation of VBs with Deformable Models Using 3D Spatial VB Probability Maps

The applied segmentation of VBs [9–11] is based on deforming mesh \(\mathcal {M}=\{\mathcal {V},\mathcal {F}\}\) (composed of vertices \(\mathcal {V}\) and faces \(\mathcal {F}\)) in the 3D MR image by iteratively minimizing the external energy, which drives \(\mathcal {M}\) towards VB boundaries, and the internal energy, which constrains \(\mathcal {M}\) according to the training set of VBs.

External Energy. In the k-th (out of K) iteration of the applied segmentation, each vertex \({\varvec{x}}_i\) of mesh \(\mathcal {M}\) (\(i\,{=}\,1,\,2,\,{\ldots },\,\left| \mathcal {V}\right| \)) is displaced along the corresponding mesh normal \({\varvec{n}}_i\) to find a more promising vertex on the VB boundary:

$$\begin{aligned} {\varvec{x}}_i^{*} = {\varvec{x}}_i + \delta \, j_i^{*} \, {\varvec{n}}_i; \quad \quad j_i^{*} = \underset{j \in \mathcal {J}}{\arg \max } \Big \{ F_i({\varvec{x}}_i + \delta \, j \, {\varvec{n}}_i) - D\left| \delta j\right| \Big \}, \end{aligned}$$
(5)

where \(\delta \) is the sampling distance, \(j_i^{*}\) is an element of the iteratively reduced search profile \(\mathcal {J}\,{=}\,\{-j',\,-j'+1,\,{\ldots },\,j'-1,\,j'\}\) along \({\varvec{n}}_i\), with \(j'\,{=}\,{\lfloor }\frac{J-1}{1-K}{\cdot }k\,{+}\,\frac{1\,-\,J\,K}{1-K}{\rfloor }\) (\({\lfloor }.{\rfloor }\) is the floor function) shrinking linearly from J at \(k\,{=}\,1\) to 1 at \(k\,{=}\,K\), and D weights the trade-off between the distance from \({\varvec{x}}_i\) to \({\varvec{x}}_i\,{+}\,\delta \,j\,{\varvec{n}}_i\) (i.e. \(\left| \delta j\right| \)) and the boundary detection function \(F_i({\varvec{x}}) = \left\langle {\varvec{n}}_i,{\varvec{g}}({\varvec{x}})\right\rangle \) (\(\left\langle .,.\right\rangle \) is the dot product), with \({\varvec{g}}({\varvec{x}})\) being the gradient of the 3D spatial VB probability map at location \({\varvec{x}}\). The external energy \(E_{\text {ext}}\) attracts vertices \({\varvec{x}}_i\) to vertices \({\varvec{x}}_i^{*}\):

$$\begin{aligned} E_{\text {ext}} = \sum _{i=1}^{\left| \mathcal {V}\right| } w_i^* \left\| \mathop {\mathrm {proj}}\left( {\varvec{x}}_i^{*} - {\varvec{x}}_i,\frac{{\varvec{g}}({\varvec{x}}_i^{*})}{\left\| {\varvec{g}}({\varvec{x}}_i^{*})\right\| }\right) \right\| ^2, \end{aligned}$$
(6)

where the weight \(w_i^*= \max \{0, F_i({\varvec{x}}_i^{*}) - D\left| \delta j_i^{*}\right| \}\) (Eq. 5) gives greater influence to vertices \({\varvec{x}}_i^*\) with stronger boundary evidence. By projecting \({\varvec{x}}_i^{*} - {\varvec{x}}_i\) onto the gradient of the 3D spatial VB probability map \({\varvec{g}}({\varvec{x}}_i^{*})\) at location \({\varvec{x}}_i^{*}\) (\(\mathop {\mathrm {proj}}(.,.)\) is the vector projection), \(E_{\text {ext}}\) becomes invariant to movements of vertices within the plane perpendicular to \({\varvec{g}}({\varvec{x}}_i^{*})\).
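A minimal NumPy sketch of the boundary search in Eq. (5) and the projection used in Eq. (6) is given below; it assumes \(F_i\) is supplied as a callable that interpolates \(\left\langle {\varvec{n}}_i,{\varvec{g}}({\varvec{x}})\right\rangle \) from the probability-map gradient:

```python
import math
import numpy as np

def profile_halfwidth(k, J, K):
    """Half-width j' of the search profile; shrinks linearly from J (k=1) to 1 (k=K)."""
    return math.floor((J - 1) / (1 - K) * k + (1 - J * K) / (1 - K))

def search_boundary_vertex(x_i, n_i, F_i, delta, j_half, D):
    """Eq. (5): pick the displacement that best trades boundary evidence
    F_i against the travelled distance |delta * j|."""
    js = np.arange(-j_half, j_half + 1)
    scores = [F_i(x_i + delta * j * n_i) - D * abs(delta * j) for j in js]
    best = int(np.argmax(scores))
    x_star = x_i + delta * js[best] * n_i
    w_star = max(0.0, scores[best])        # weight w_i* used in Eq. (6)
    return x_star, w_star

def proj(v, g):
    """Vector projection of v onto the gradient direction g (Eq. (6))."""
    u = g / np.linalg.norm(g)
    return (v @ u) * u
```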

Internal Energy. From the training set of VBs, M aligned and topologically corresponding meshes \(\{\mathcal {M}_m\}_{m=1}^{M}\,{=}\,\{\mathcal {V}_m,\,\mathcal {F}_m\}_{m=1}^{M}\) (\(\left| \mathcal {V}_m\right| \,{=}\,\left| \mathcal {V}\right| \), \(\mathcal {F}_m\,{=}\,\mathcal {F}\)) are extracted, and their vertices \(\{\mathcal {V}_m\}_{m=1}^{M}\) are stacked into matrix \(S\,{=}\,[{\varvec{s}}_1,\,{\ldots },\,{\varvec{s}}_M]\) of size \(3\left| \mathcal {V}\right| {\times }M\), where vector \({\varvec{s}}_m\,{=}\,\varPhi (\mathcal {V}_m)\) contains the coordinate triplets of \(\mathcal {V}_m\), vectorized by function \(\varPhi \). In each k-th (out of K) iteration of the applied segmentation, \(\mathcal {M}\) is defined as the linear combination of \(\{\mathcal {M}_m\}_{m=1}^{M}\) that best approximates \(\varPhi (\mathcal {V})\) with S:

$$\begin{aligned} {\varvec{w}}^{*} = \underset{{\varvec{w}}{\in }\mathbb {R}^{M{\times }1}}{\arg \min }\left\| S{\varvec{w}} - T(\varPhi (\mathcal {V}))\right\| , \end{aligned}$$
(7)

where \({\varvec{w}}\) are linear combination coefficients, and T is the rigid transformation that aligns \(\varPhi (\mathcal {V})\) to S. Coefficients \({\varvec{w}}^*\) represent contributions of \(\{\mathcal {M}_m\}_{m=1}^{M}\) to the reconfiguration of vertices \(\mathcal {V}^{*}\,{=}\,\varPhi ^{-1}\left( T^{-1}(S{\varvec{w}}^{*})\right) \) that form a new topologically corresponding mesh \(\mathcal {M}^{*}\,{=}\,\{\mathcal {V}^{*},\,\mathcal {F}^{*}\}\) (\(\left| \mathcal {V}^{*}\right| \,{=}\,\left| \mathcal {V}\right| \), \(\mathcal {F}^{*}\,{=}\,\mathcal {F}\)). The internal energy \(E_{\text {int}}\) penalizes the deviation of vertices \({\varvec{y}}_i\,{\in }\,\mathcal {V}\) from vertices \({\varvec{y}}_i^{*}\,{\in }\,\mathcal {V}^{*}\):

$$\begin{aligned} E_{\text {int}} = \sum _{i=1}^{\left| \mathcal {V}\right| }\sum _{j\in \mathcal {N}_i}\left\| \left( {\varvec{y}}_i-{\varvec{y}}_j\right) -\left( {\varvec{y}}_i^{*}-{\varvec{y}}_j^{*}\right) \right\| ^2, \end{aligned}$$
(8)

where \(\mathcal {N}_i\) is the set of vertices in the neighborhood of \({\varvec{y}}_i\) (or \({\varvec{y}}_i^{*}\), as the topology is preserved). The deformation flexibility of mesh \(\mathcal {M}\) is therefore regulated by the shape prior constraint between vertices \(\mathcal {V}\) and \(\mathcal {V}^{*}\).
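Under the notation above, Eq. (7) is an ordinary least-squares problem and Eq. (8) a sum over mesh edges; a minimal NumPy sketch follows, where the rigid transformation T is assumed to be given as a callable:

```python
import numpy as np

def shape_coefficients(S, v_current, T):
    """Eq. (7): w* = argmin_w || S w - T(Phi(V)) || via least squares.

    S         : (3|V|, M) vectorized training shapes
    v_current : (3|V|,)   vectorized current vertices Phi(V)
    T         : callable  rigid alignment of Phi(V) to the shape space
    """
    w_star, *_ = np.linalg.lstsq(S, T(v_current), rcond=None)
    return w_star

def internal_energy(V, V_star, neighbors):
    """Eq. (8): deviation of current edge vectors from the shape-prior edges.

    V, V_star : (|V|, 3) current and reconfigured vertices
    neighbors : list of index lists, neighbors[i] = N_i
    """
    E = 0.0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            d = (V[i] - V[j]) - (V_star[i] - V_star[j])
            E += float(d @ d)
    return E
```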

Energy Minimization. Once vertices \({\varvec{x}}_i^{*}\) and \({\varvec{y}}_i^{*}\) are determined, mesh \(\mathcal {M}\) is reconfigured in each k-th iteration of the applied segmentation by minimizing \(E\,{=}\,E_{\text {ext}}\,{+}\,{\alpha }E_{\text {int}}\), where \(\alpha \) balances the contribution of the two energy terms. According to \(E_{\text {ext}}\), \(\mathcal {M}\) is driven along iteratively reduced displacements towards VB boundaries, identified as strong gradients in the 3D spatial VB probability map generated by the proposed 3D CNN, while \(E_{\text {int}}\) restricts the flexibility of \(\mathcal {M}\) by preserving the vertex distribution learned from the training set of VBs. After K iterations, mesh \(\mathcal {M}^*\) represents the final segmentation of the observed VB.
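Schematically, the per-iteration reconfiguration can be written as a single unconstrained minimization; the sketch below uses SciPy's conjugate gradient solver, matching the solver named in Sect. 3.2, and assumes \(E_{\text {ext}}\) and \(E_{\text {int}}\) have already been closed over the current targets \({\varvec{x}}_i^{*}\) and \({\varvec{y}}_i^{*}\):

```python
import numpy as np
from scipy.optimize import minimize

def reconfigure_mesh(V0, E_ext, E_int, alpha=180.0):
    """Minimize E = E_ext + alpha * E_int over stacked vertex coordinates.

    V0           : (|V|, 3) current vertices
    E_ext, E_int : callables mapping a (|V|, 3) array to a scalar energy
    alpha        : trade-off weight (value from Sect. 3.2)
    """
    def E(v_flat):
        V = v_flat.reshape(-1, 3)
        return E_ext(V) + alpha * E_int(V)
    res = minimize(E, V0.ravel(), method='CG')   # gradients approximated numerically
    return res.x.reshape(-1, 3)
```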

3 Results

3.1 Image Database

The proposed method was evaluated on a publicly available database of 23 sagittally reconstructed T2-weighted turbo spin-echo 3D MR images of the lower spine from 23 subjects, acquired with a 1.5 Tesla Siemens scanner and resampled to a voxel size of \(2\,{\times }\,1.25\,{\times }\,1.25\,\text {mm}^3\). For each VB from T11 to L5 (\(7\,{\cdot }\,23\,{=}\,161\) in total), a reference manual segmentation was available [6]. For method evaluation, the database was split into 11 training and 12 testing images, and vice versa, according to a leave-half-images-out scheme that was repeated five times.

3.2 Implementation Details

Mean VB Shape Models. The marching cubes algorithm was first applied to each reference segmentation \(B_{ij}\) of the i-th VB (\(i\,{=}\,1,\,2,\,{\ldots },\,7\) for T11 to L5) in the j-th training image (\(j\,{=}\,1,\,2,\,{\ldots },\,M\) for \(M\,{=}\,11\) or 12 images). The resulting meshes \(\{\mathcal {M}_{ij}\}_{j=1}^{M}\) were aligned and made topologically corresponding by applying isotropic remeshing, coherent point drift and generalized Procrustes analysis, yielding the mean shape model \(\mathcal {M}_{i}\) of the i-th VB. Vertices of \(\mathcal {M}_{i}\) were further used for the deformable model-based segmentation guided by 3D spatial VB probability maps.
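Isotropic remeshing and coherent point drift require dedicated implementations, but the rigid alignment at the core of the generalized Procrustes step can be sketched compactly with the Kabsch algorithm; this is our illustration, not necessarily the authors' exact procedure:

```python
import numpy as np

def rigid_align(A, B):
    """Least-squares rigid alignment of point set A onto point set B.

    A, B : (N, 3) arrays of corresponding mesh vertices
    """
    a, b = A.mean(axis=0), B.mean(axis=0)
    H = (A - a).T @ (B - b)                      # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return (A - a) @ R.T + b                     # rotated and translated copy of A
```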

CNN Architecture. For each i-th VB in the 3D testing image, its mean shape model \(\mathcal {M}_i\) was first initialized by coarsely identifying the corresponding VB centroid, and the minimum bounding box of \(\mathcal {M}_i\) was enlarged by a factor of two to obtain the VOI of the observed VB. For each VOI voxel, a 3D patch \({\varvec{P}}\) of \(15\,{\times }\,15\,{\times }\,15\) mm\(^3\) was extracted, isotropically resampled to a grid of \(15\,{\times }\,15\,{\times }\,15\) voxels, and fed to a 3D CNN with two convolutional (\(\mathcal {C}_1\) and \(\mathcal {C}_2\)), two pooling (\(\mathcal {P}_1\) and \(\mathcal {P}_2\)) and two fully connected layers (\(\mathcal {F}_3\) and \(\mathcal {F}_4\)). For CNN training, the same number of patches was extracted within and outside VBs (100k patches per training image). By using 48 kernels of size \(4\,{\times }\,4\,{\times }\,4\) with 3120 weights \({\varvec{W}}_{\!1}\) and biases \({\varvec{b}}_1\), layer \(\mathcal {C}_1\) generated 48 ordinary maps that were activated by \(f_1(x)\,{=}\,\max \{x,\,\frac{x}{10}\}\) and then transformed in layer \(\mathcal {P}_1\) into pooled maps by replacing each \(2\,{\times }\,2\,{\times }\,2\) block with its maximum, i.e. \(g_1\,{=}\,\max \). By convolving the pooled maps with 48 kernels of size \(3\,{\times }\,3\,{\times }\,3\) with 62256 weights \({\varvec{W}}_{\!2}\) and biases \({\varvec{b}}_2\), layer \(\mathcal {C}_2\) generated 48 ordinary maps that were activated by \(f_2\,{=}\,f_1\) and then transformed in layer \(\mathcal {P}_2\) into pooled maps in the same way as in \(\mathcal {P}_1\). Layer \(\mathcal {F}_3\) consisted of 192 linear combinations from \(\mathcal {P}_2\) with 73728 weights \({\varvec{W}}_{\!3}\) and biases \({\varvec{b}}_3\), activated by \(h_3(x)\,{=}\,\max \{x,0\}\), while layer \(\mathcal {F}_4\) consisted of two linear combinations from \(\mathcal {F}_3\) with 386 weights \({\varvec{W}}_{\!4}\) and biases \({\varvec{b}}_4\), activated by the identity \(h_4(x)\,{=}\,x\). The resulting vector \({\varvec{w}}_4\,{=}\,(w_{4,1},\,w_{4,2})^T\) was passed through the softmax function to obtain vector \({\varvec{p}}\,{=}\,(p_1,\,p_2)^T\), where \(p_1\) and \(p_2\) represented the probabilities that the observed VOI voxel belongs to the VB or to the background, respectively. Vector \({\varvec{p}}\) was obtained for every VOI voxel, resulting in the 3D spatial VB probability map for the observed VOI.

The optimal weights \({\varvec{W\!}}\) and biases \({\varvec{b}}\) of the CNN were obtained with stochastic gradient descent by minimizing the \(L^2\)-regularized negative log-likelihood cost function \(l(\mathcal {T},\,\{{\varvec{W\!}},{\varvec{b}}\})\,{=}\,-\sum _{{\varvec{P}} \in \mathcal {T}} \log P(X\,{=}\,c\,{\mid }\,{\varvec{P}},\,\{{\varvec{W\!}},{\varvec{b}}\}) + \lambda \left\| \{{\varvec{W\!}},{\varvec{b}}\}\right\| _2^2\), which explicitly counteracts overfitting, where \(P(X=c)\) is the probability that random variable X belongs to class c (VB or background), \(\lambda \,{=}\,0.001\) is the regularization parameter, and \(\mathcal {T}\) is the set of 3D patches with known classes c extracted from the training images. During training, the CNN classification accuracy converged to around \(97.8\,\%\) after 5 out of 20 epochs, and then fluctuated by only \(0.3\,\%\).
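For reference, the described architecture can be reproduced almost line for line in a modern framework; the PyTorch sketch below matches the stated layer sizes and parameter counts, while the learning rate is our assumption, since it is not reported:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv3d(1, 48, kernel_size=4),    # C1: 48*1*4^3 + 48 = 3120 parameters
    nn.LeakyReLU(0.1),                  # f1(x) = max(x, x/10)
    nn.MaxPool3d(2),                    # P1: 2x2x2 maximum, 12 -> 6 voxels per side
    nn.Conv3d(48, 48, kernel_size=3),   # C2: 48*48*3^3 + 48 = 62256 parameters
    nn.LeakyReLU(0.1),                  # f2 = f1
    nn.MaxPool3d(2),                    # P2: 4 -> 2 voxels per side
    nn.Flatten(),                       # 48 * 2^3 = 384 features
    nn.Linear(384, 192),                # F3: 73728 weights + 192 biases
    nn.ReLU(),                          # h3(x) = max(x, 0)
    nn.Linear(192, 2),                  # F4: 384 weights + 2 biases, identity h4
)

x = torch.randn(8, 1, 15, 15, 15)                  # a batch of 15x15x15 patches
p = torch.softmax(model(x), dim=1)                 # Eq. (4): (p1, p2) per patch
print(sum(t.numel() for t in model.parameters()))  # 139682 trainable parameters

# Training sketch: SGD with weight decay playing the role of the L2 term;
# CrossEntropyLoss combines log-softmax with the negative log-likelihood.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.001)
loss_fn = nn.CrossEntropyLoss()
```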

Table 1. Segmentation results for 161 vertebral bodies (levels T11–L5) from 23 MR spine images, presented as the mean (± standard deviation) of the Dice similarity coefficient (DSC), mean symmetric surface distance (MSSD) and Hausdorff distance (HD), and compared to the results of Chu et al. [6] obtained on the same image database.

Segmentation with Deformable Models. For each i-th VB in the 3D testing image, the mean shape model \(\mathcal {M}_i\) was used for the deformable model-based segmentation guided by the generated 3D spatial VB probability map, with experimentally defined parameters \(\alpha \,{=}\,180\), \(J\,{=}\,35\), \(D\,{=}\,5.0\) mm\(^{-2}\), \(\delta \,{=}\,0.25\) mm and \(K\,{=}\,15\) iterations. The conjugate gradient method was applied to obtain the coefficients \({\varvec{w}}^*\) and to minimize the energy E.

3.3 Experimental Results

The obtained segmentations were compared to the reference segmentations (Table 1), resulting in (\(\varDelta \) indicates the mean difference across the five leave-half-images-out experiments) an overall Dice similarity coefficient of \(93.4\,{\pm }\,1.7\,\%\) (\(\varDelta \,{=}\,0.5\,\%\)), a mean symmetric surface distance of \(0.54\,{\pm }\,0.14\,\text {mm}\) (\(\varDelta \,{=}\,0.13\,\text {mm}\)) and a Hausdorff distance of \(3.83\,{\pm }\,1.04\,\text {mm}\) (\(\varDelta \,{=}\,0.53\,\text {mm}\)). Figures 1(b) and 2(a) show examples of VB segmentations, and Fig. 2(b) shows their dependence on initialization.

Fig. 2. (a) Segmentations of vertebral bodies (VB) for a selected MR image, shown in the mid-sagittal and mid-coronal cross-sections, and in 3D. (b) Segmentation results after displacing the initialization of mean VB shape models from reference VB centroids (DSC – Dice similarity coefficient, MSSD – mean symmetric surface distance).

4 Discussion and Conclusion

By demonstrating that heuristically crafted features can be efficiently replaced by features learned directly from image appearance, CNNs have proved successful in image classification [8] and have therefore also been applied to medical image segmentation [12–14]. In this paper, we presented an automated method for supervised segmentation of VBs from 3D MR spine images based on coupling 3D CNNs with deformable models. By comparing the results to those of existing methods, in particular to those of Chu et al. [6] obtained on the same image database, the proposed method proved superior in terms of segmentation accuracy and robustness (Table 1). Moreover, as shown in Fig. 2(b), segmentations fully converged when initialized within a \(7.5\,\text {mm}\) radius of the reference VB centroids, meaning that the results were not biased by the coarse manual initialization, which could eventually be replaced by automated VB localization [6]. Besides yielding superior results, the proposed method provides additional advantages related to the use of CNNs for segmentation. Existing CNN-based 3D segmentation approaches [12–14] apply two-dimensional (2D) CNNs, whose input is a 2D image or patch, to each individual cross-section of the 3D image, and the results are often over- or under-segmented because they are obtained by thresholding the CNN probability maps combined with morphological processing and/or removal of connected components. In contrast, the main novelty of the proposed method is that it relies on 3D spatial VB probability maps, which are obtained by the designed 3D CNN architecture and guide the shape-constrained deformable models towards highly accurate and robust VB segmentations.