1 Introduction

Automatic quantitative analysis of brain tumours assists in faster, more accurate diagnosis and surgical planning. Developing accurate and reliable tumour segmentation from multi-modal MRI remains a challenging task due to many sources of variability, including tumour type, shape and size, and intensity and contrast differences across MR images. Classical approaches that have been successfully applied to tumour segmentation include multi-atlas segmentation, probabilistic graphical models such as Markov Random Fields (MRF) [6] and Conditional Random Fields (CRF), and Random Forests (RF) [7]. Methods based on generative models have also been explored for tumour segmentation [8].

Inspired by the success of deep learning in many tasks on natural images, such as semantic segmentation [10], object detection [11], and classification [12], many deep learning based approaches have been proposed for tasks in medical imaging, such as segmentation [13], synthesis [14], and classification [15]. Various CNN architectures have been explored for brain tumour segmentation which model global and local image context either explicitly [9, 16] or implicitly [17, 18]. These architectures either take MR images at multiple resolutions as input [9, 16] or process single-resolution MR images at multiple scales [17, 18]. One advantage of deep learning based approaches over classical segmentation methods such as MRF and RF is that they do not require hand-crafted features, because the networks are trained in an end-to-end manner with appropriate loss functions. In recent BraTS challenges [1], deep learning based approaches have outperformed classical methods.

In this work, we develop a modified version of the popular 3D U-net [13] architecture for the brain tumour segmentation task on the BraTS 2018 datasets. The U-net architecture has been successfully applied to many medical imaging segmentation tasks, such as liver and lesion segmentation [19], retinal layer segmentation [20], and organ segmentation [21]. In this paper, the 3D U-net is trained using a weighted Categorical Cross Entropy (CCE) loss function on the BraTS 2018 training dataset, and a curriculum on class weights is employed to address class imbalance [26]. We achieved competitive results on the BraTS 2018 [5] validation and testing datasets, with Dice scores of 0.788, 0.909, and 0.825 on the validation dataset, and 0.706, 0.871, and 0.771 on the testing dataset, for enhancing tumour, whole tumour, and tumour core, respectively.

Fig. 1. The 3D U-net CNN architecture takes as input four full 3D MR image sequences and generates a multi-class segmentation of the tumour into sub-types.

2 Method

A flowchart of the 3D U-net architecture can be seen in Fig. 1. The network takes as input full 3D volumes of all available sequences of a patient and generates a multi-class segmentation of the tumour into sub-types at the same resolution. The 3D U-net is similar to the one proposed in [13], with some modifications. The U-net consists of 4 resolution steps for both the encoder and decoder paths. At the start, we use 2 consecutive 3D convolutions of size \(3\times 3\times 3\) with k filters, where k denotes the user-defined initial number of convolution filters. Each step in the encoder path consists of 2 3D convolutions of size \(3 \times 3 \times 3\) with \(k * 2^n\) filters, where n denotes the U-net resolution step, followed by average pooling of size \(2 \times 2 \times 2\). We chose average pooling instead of max pooling as it allows better gradient flow between consecutive layers. At the end of each encoder step, instance normalization [22] is applied, followed by dropout [23] with 0.05 probability. Instance normalization was preferred over batch normalization due to memory constraints, as we were able to fit only one volume at a time in the available GPU memory. At each step in the decoder path, a 3D transposed convolution of size \(3 \times 3 \times 3\) with \(2 \times 2 \times 2\) stride and \(k * 2^n\) filters performs the upsampling. The output of the transposed convolution is concatenated with the corresponding output of the encoder path. We chose transposed convolution because it allows the network to learn an optimal interpolation function, rather than relying on the pre-defined interpolation of standard upsampling. This is, once again, followed by instance normalization and dropout with 0.05 probability. Finally, 2 3D convolutions of size \(3 \times 3 \times 3\) with \(k * 2^n\) filters are applied. The rectified linear unit (ReLU) is used as the non-linearity for every convolution layer. The last layer has C filters, where C denotes the total number of classes, and is followed by a SoftMax non-linearity.
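To make the step structure concrete, the following is a minimal sketch of one encoder step and one decoder step as described above. The paper does not specify an implementation framework, so the use of PyTorch and all module and variable names here are illustrative assumptions, not the authors' code.

    import torch
    import torch.nn as nn

    class EncoderStep(nn.Module):
        """One encoder step: two 3x3x3 convolutions (ReLU), 2x2x2 average
        pooling, instance normalization, then dropout (p = 0.05)."""
        def __init__(self, in_ch, k, n, p_drop=0.05):
            super().__init__()
            out_ch = k * 2 ** n  # k * 2^n filters at resolution step n
            self.block = nn.Sequential(
                nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.AvgPool3d(kernel_size=2),   # average pooling, not max
                nn.InstanceNorm3d(out_ch),     # instance norm (batch size 1)
                nn.Dropout3d(p=p_drop),
            )

        def forward(self, x):
            return self.block(x)

    class DecoderStep(nn.Module):
        """One decoder step: strided 3x3x3 transposed convolution (learned
        upsampling), concatenation with the encoder skip connection,
        instance normalization and dropout, then two 3x3x3 convolutions."""
        def __init__(self, in_ch, skip_ch, k, n, p_drop=0.05):
            super().__init__()
            out_ch = k * 2 ** n
            self.up = nn.ConvTranspose3d(in_ch, out_ch, kernel_size=3,
                                         stride=2, padding=1, output_padding=1)
            self.post = nn.Sequential(
                nn.InstanceNorm3d(out_ch + skip_ch),
                nn.Dropout3d(p=p_drop),
                nn.Conv3d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )

        def forward(self, x, skip):
            return self.post(torch.cat([self.up(x), skip], dim=1))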

2.1 Loss Function

We optimize a weighted Categorical Cross Entropy (CCE) loss function during training, given by:

$$\begin{aligned} CCE^i = -\sum _n w_n^i \sum _l t_{n,l}^{i} \log p_{n,l}^{i} \end{aligned}$$
(1)
$$\begin{aligned} w_n^i = w_l * y_n^i \qquad \text {where } w_l = \left( \frac{\sum _{k=1}^{C} m_k}{m_l}\right) * r^{ep} + 1, \end{aligned}$$
(2)

where \(w_n^i\) and \(w_l\) denote the weight for voxel n of volume i and the weight of class l, respectively, \(m_l\) is the total number of voxels of the \(l^{th}\) class in the training dataset, and C denotes the total number of classes. The class weights \(w_l\) decay over epochs ep at a rate \(r \in [0,1]\). Note that \(w_l\) converges to 1 as ep becomes large, ensuring that all samples receive equal weight in the later stages of training. This method of weighting classes is known as curriculum class weighting [26].
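As a concrete illustration, the following is a minimal sketch of Eqs. (1) and (2), assuming PyTorch; the function names and the per-voxel weighting via label indexing are illustrative assumptions rather than the authors' code.

    import torch
    import torch.nn.functional as F

    def class_weights(voxel_counts, r, epoch):
        # voxel_counts[l] = m_l, the voxel count of class l over the training set
        total = voxel_counts.sum()
        # Eq. (2): weights start near the inverse class frequency and decay to 1
        return (total / voxel_counts) * (r ** epoch) + 1.0

    def weighted_cce(logits, target, w_l):
        # logits: (1, C, D, H, W) network outputs; target: (1, D, H, W) labels
        log_p = F.log_softmax(logits, dim=1)
        nll = F.nll_loss(log_p, target, reduction='none')  # -log p of true class
        return (w_l[target] * nll).sum()  # Eq. (1): voxel n gets its class weight

The weights would be recomputed at the start of each epoch, so the strong re-weighting of rare classes early in training relaxes toward uniform weighting as \(r^{ep}\) decays.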

3 Experiments and Results

3.1 Data

BraTS 2018 Training Set: The BraTS 2018 training dataset comprises 210 high-grade and 75 low-grade glioma patient MRIs. For each patient, T1, T1 post-contrast (T1c), T2, and Fluid Attenuated Inversion Recovery (FLAIR) MR volumes are provided, along with an expert tumour segmentation. Each brain tumour is manually delineated into 3 classes: edema, necrotic/non-enhancing core, and enhancing tumour core [1,2,3,4,5].

BraTS 2018 Validation Set: The BraTS 2018 validation dataset comprises 66 patient MRIs. For each patient, T1, T1c, T2, and FLAIR MR volumes are provided. No expert tumour segmentation masks are provided, and the grade of each glioma is not specified [1,2,3,4,5].

BraTS 2018 Testing Set: The BraTS 2018 testing dataset comprises 191 patient MRIs. As with the validation dataset, T1, T1c, T2, and FLAIR MR volumes are provided for each patient, but expert tumour segmentation masks are not. The grade of each glioma is also not specified [1,2,3,4,5].

3.2 Pre-processing

The BraTS challenge provides isotropic, skull-stripped, and co-registered MR volumes. We follow this with a few additional pre-processing steps. Each volume's intensities were normalized by subtracting the mean and dividing by the standard deviation, then re-scaled to the range [0, 1], and the volumes were cropped to \(184 \times 200 \times 152\).
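A minimal NumPy sketch of these steps is given below; the text does not state where the crop is placed, so the centre crop shown is an assumption.

    import numpy as np

    def preprocess(volume, out_shape=(184, 200, 152)):
        v = (volume - volume.mean()) / volume.std()  # mean subtraction, std division
        v = (v - v.min()) / (v.max() - v.min())      # re-scale to [0, 1]
        # crop to the target shape (centre crop assumed)
        starts = [(s - o) // 2 for s, o in zip(v.shape, out_shape)]
        return v[tuple(slice(st, st + o) for st, o in zip(starts, out_shape))]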

3.3 5-Fold Cross Validation

We performed 5-fold cross validation on the training dataset. The BraTS 2018 training dataset was randomly split into five folds of 57 patients each, such that each fold contains 42 high-grade and 15 low-grade glioma patients. We trained our network five times, with 4 folds used to train the network and the remaining fold used to validate it.

Note that we use all five networks obtained from cross-validation as an ensemble to predict segmentations for the BraTS 2018 validation and testing datasets. We view this ensemble as bagging [25], which has been shown to improve performance over a single model.
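A brief sketch of this ensembling is shown below, assuming each model outputs per-voxel class scores. Averaging the softmax probabilities before taking the arg-max is one common fusion choice and is an assumption here, as the text does not state how the five predictions are combined.

    import torch

    def ensemble_predict(models, volume):
        # volume: (1, 4, D, H, W); average softmax probabilities over the 5 models
        with torch.no_grad():
            probs = torch.stack([m(volume).softmax(dim=1) for m in models])
        return probs.mean(dim=0).argmax(dim=1)  # (1, D, H, W) label map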

Parameters. In our network, we set the initial number of filters \(k = 20\) and the number of filters in the last layer \(C = 4\). We optimize the loss function in Eq. (1) using Adam [24] with a learning rate of 0.001 and a batch size of 1. The network is trained for a total of 240 epochs, and the learning rate is decayed by a factor of 0.75 every 50 epochs. The decay rate r in Eq. (2) is set to 0.95. We regularize the model using data augmentation: at each training iteration, a random affine transformation is applied to the MR volumes and the corresponding segmentation mask. Random translation, rotation, scaling, and shear transformations are applied, with ranges sampled from uniform distributions over \([-5,5]\), \([-3^{\circ },3^{\circ }]\), \([-0.1,0.1]\), and \([-0.1,0.1]\), respectively. Volumes are also randomly flipped left to right.
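The stated schedule and augmentation ranges could be set up as follows (a PyTorch/NumPy sketch; the framework and helper names are assumptions, and the sampled affine parameters would be applied with any volumetric resampling routine).

    import numpy as np
    import torch

    def make_optimizer(model):
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        # decay the learning rate by a factor of 0.75 every 50 epochs
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                    step_size=50, gamma=0.75)
        return optimizer, scheduler

    def sample_affine_params(rng):
        # rng: a np.random.Generator, e.g. np.random.default_rng()
        return dict(
            translation=rng.uniform(-5, 5, size=3),
            rotation_deg=rng.uniform(-3, 3, size=3),
            scale=1.0 + rng.uniform(-0.1, 0.1, size=3),
            shear=rng.uniform(-0.1, 0.1, size=3),
            flip_lr=rng.random() < 0.5,  # random left-right flip
        )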

Fig. 2. Training (left) and validation (right) Dice scores as a function of the number of epochs for one of the five cross-validation folds.

Learning Curves. Figure 2 shows an example of the evolution of the various Dice scores (Tumour, Enhance, Core, and Average) as a function of the number of epochs for one of the 5 cross-validation folds.

4 Discussion

4.1 Quantitative Results

Our method performed well, resulting in Dice scores of 0.788, 0.909, and 0.825 (BraTS 2018 validation dataset) and 0.706, 0.871, and 0.771 (BraTS 2018 testing dataset) for enhancing tumour, whole tumour, and tumour core, respectively. Tables 1, 2, and 3 show the results of our method based on the different evaluation metric statistics provided by the challenge organizers. The results are based on the following BraTS 2018 experiments: 5-fold cross validation on the training dataset, and tests on the validation and testing datasets. They indicate that the proposed method performs very well on whole tumours and tumour cores, with relatively lower performance on enhancing tumours. This was expected, as segmentation of enhancing tumour relies heavily on T1c images, where it presents similarly to other enhancing structures; for the other tumour sub-types, the remaining modalities assist the segmentation.
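For reference, the Dice score reported throughout is the standard overlap measure \(2|A \cap B| / (|A| + |B|)\), computed per tumour sub-region. A minimal sketch of this standard definition (not the challenge's evaluation code) is:

    import numpy as np

    def dice(pred, truth):
        # pred, truth: binary masks for one tumour sub-region (ET, WT, or TC)
        pred, truth = pred.astype(bool), truth.astype(bool)
        denom = pred.sum() + truth.sum()
        return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, truth).sum() / denom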

Table 1. Evaluation metric statistics for 5-fold cross validation on BraTS 2018 training dataset for enhancing tumour (ET), whole tumour (WT), and tumour core (TC).
Table 2. Evaluation metric statistics for BraTS 2018 validation dataset for enhancing tumour (ET), whole tumour (WT), and tumour core (TC).
Table 3. Evaluation metric statistics for BraTS 2018 testing dataset for enhancing tumour (ET), whole tumour (WT), and tumour core (TC).

4.2 Qualitative Results

Figures 3 and 4 show example slices with the resulting segmentation labels for high-grade and low-grade glioma patients from one fold of the experiments on the BraTS 2018 training dataset. We observe that the network performs much better on high-grade glioma cases. This can be attributed to the fact that there are more training examples of high-grade glioma cases than of low-grade glioma cases. Example slices with predicted segmentation labels on the BraTS 2018 validation and testing datasets can be seen in Figs. 5, 6, and 7.

Fig. 3. Examples of high-grade glioma segmentation results on the BraTS 2018 training dataset. The expert segmentation (Column 2) and the predicted segmentation (Column 3) are overlaid on the T1c MR volume (Column 1). The green label is edema, the red label is non-enhancing or necrotic tumour core, and the yellow label is enhancing tumour core. (Color figure online)

Fig. 4. Examples of low-grade glioma segmentation results on the BraTS 2018 training dataset. The expert segmentation (Column 2) and the predicted segmentation (Column 3) are overlaid on the T1c MR volume (Column 1). The green label is edema, the red label is non-enhancing or necrotic tumour core, and the yellow label is enhancing tumour core. (Color figure online)

Fig. 5. Examples of segmentation results on the BraTS 2018 validation dataset. The predicted segmentation (Column 2) is overlaid on the T1c MR volume (Column 1). The green label is edema, the red label is non-enhancing or necrotic tumour core, and the yellow label is enhancing tumour core. (Color figure online)

Fig. 6. Examples of segmentation results on the BraTS 2018 validation dataset. The predicted segmentation (Column 2) is overlaid on the T1c MR volume (Column 1). The green label is edema, the red label is non-enhancing or necrotic tumour core, and the yellow label is enhancing tumour core. (Color figure online)

Fig. 7. Examples of segmentation results on the BraTS 2018 testing dataset. The predicted segmentation (Column 2) is overlaid on the T1c MR volume (Column 1). The green label is edema, the red label is non-enhancing or necrotic tumour core, and the yellow label is enhancing tumour core. (Color figure online)

5 Conclusion

In this work, we demonstrated how a simple CNN such as the 3D U-net [13] can be successfully applied to the task of tumour segmentation. The U-net processes the input multi-modal MR images at multiple scales, which allows it to learn the local and global context necessary for tumour segmentation. The network was trained using a curriculum on class weights to address class imbalance, showing competitive results for brain tumour segmentation on the BraTS 2018 [5] testing dataset. Our method achieved the following Dice scores for enhancing tumour, whole tumour, and tumour core on the BraTS 2018 [5] validation and testing datasets: 0.788, 0.909, and 0.825 (validation), and 0.706, 0.871, and 0.771 (testing). However, the method showed a degradation in performance on the testing dataset in the enhancing tumour (ET) and tumour core (TC) categories.