A mix-pooling CNN architecture with FCRF for brain tumor segmentation

https://doi.org/10.1016/j.jvcir.2018.11.047

Abstract

MRI is widely used by doctors to diagnose and assess glioblastomas, the most lethal form of brain tumor. Although Convolutional Neural Networks (CNNs) have been applied to automatic brain tumor segmentation and proved useful and efficient, the traditional one-pathway CNN architecture with convolutional and max pooling layers has limited receptive fields that represent only local context, and may therefore discard useful global context information. In this paper, we design a two-pathway model with average and max pooling layers in different paths. In addition, 1 × 1 kernels follow the input layer to add non-linear dimensions to the input data. Finally, we combine the CNN architecture with a fully connected CRF (FCRF) as a mixture model that introduces global context information to refine the prediction results. Our experiments show that the mixture model improves segmentation and labeling accuracy.

Introduction

Gliomas, which arise from glial cells, are the most frequent primary brain tumors in adults, accounting for 70% of adult malignant primary brain tumors [1]. They include low-grade types (e.g. oligodendroglioma) and high-grade types (e.g. glioblastoma); the former are less lethal than the latter, which reduces a patient's life expectancy to less than two years. Surgery, radiation therapy, chemotherapy, or their combination are the most common treatments. Owing to its high-resolution multiplanar structural information and substantially improved tissue characterization, MRI is commonly used to assess gliomas in clinical practice [2]. Segmentation of MR images is crucial for monitoring the trend of tumor volume in patients during therapy. It also plays an important role in surgical or radiotherapy planning, where not only the tumor has to be outlined but also surrounding healthy structures are of interest [1]. Currently, manual segmentation still dominates clinical practice; it is time-consuming and requires clinical experience. Moreover, the complex morphology of gliomas and the subtle changes between MRI examinations are difficult to detect reliably by visual inspection of the images, even for an experienced radiologist [2]. Accurate automatic brain tumor segmentation methods are therefore valuable in clinical practice.

According to Gordillo [3], segmentation techniques based on 2D MRI slices can be divided into three major classes: threshold-based techniques, region-based techniques, and pixel classification techniques.

Threshold-based methods provide at least one threshold to classify the segmentation targets (e.g. voxels) of the MR image by their intensities. Gibbs et al. [4] proposed a three-step mixture method. First, a Sobel edge filter is applied to the original image to outline edge probabilities, which serve as the basis for computing voxel values. Then, by comparing each pixel's value to the threshold, the system assigns each pixel to a region. Finally, the previous output is denoised.
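The three-step pipeline described above can be sketched in a few lines. This is a minimal illustration, not the implementation of Gibbs et al.; the function name, the fixed threshold, and the choice of a 3 × 3 median filter for the denoising step are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage

def threshold_segment(slice2d, thresh):
    """Hypothetical sketch of a Gibbs-style three-step pipeline:
    edge filtering, thresholding, then denoising."""
    # Step 1: Sobel edge magnitude as a proxy for edge probability.
    gx = ndimage.sobel(slice2d.astype(float), axis=0)
    gy = ndimage.sobel(slice2d.astype(float), axis=1)
    edges = np.hypot(gx, gy)
    # Step 2: assign each pixel by comparing its value to the threshold.
    mask = slice2d > thresh
    # Step 3: denoise the binary output with a small median filter.
    return ndimage.median_filter(mask.astype(np.uint8), size=3), edges
```

In practice the edge map would feed into the per-voxel value computation; here it is simply returned alongside the mask.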

Region-based methods separate voxels into mutually exclusive regions according to preset rules. They are similar to clustering methods, which group voxels with homogeneous properties into the same region [5]. Region growing is the most classic such method: it plants at least one seed in each area, then determines which region each voxel belongs to by comparing the homogeneity of the voxel with its neighboring seeds [6]. However, the method cannot address the partial volume effect caused by the large, less sharply defined voxels of medical images. Lakare et al. [7] proposed a modified method that eliminates the partial volume effect by detecting the boundary exactly from gradient information [8]. An adaptive region growing method proposed by Deng et al. [9] segments the region precisely according to the mean variance inside the boundary curve and the mean gradient along it.
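The classic seeded region growing scheme can be sketched as a breadth-first flood fill. This is an illustrative 2D toy, not the adaptive method of Deng et al.; the homogeneity rule here (intensity within a fixed tolerance of the seed's intensity) is the simplest possible assumption.

```python
from collections import deque
import numpy as np

def region_grow(image, seed, tol):
    """Hypothetical sketch of seeded region growing: a pixel joins the
    region when its intensity is within `tol` of the seed's intensity."""
    h, w = image.shape
    region = np.zeros((h, w), dtype=bool)
    seed_val = float(image[seed])
    region[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Examine the 4-connected neighbors of the current pixel.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not region[nr, nc]
                    and abs(float(image[nr, nc]) - seed_val) <= tol):
                region[nr, nc] = True
                queue.append((nr, nc))
    return region
```

The partial volume problem mentioned above shows up exactly at this homogeneity test: boundary voxels with mixed-tissue intensities fall inside or outside `tol` almost arbitrarily, which is what gradient-based refinements try to fix.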

Pixel classification methods, which are mainly based on the multi-modal properties (Flair, T1, T2, T1c) of each voxel, are divided into unsupervised, semi-supervised, and supervised approaches. Clustering, the most classic unsupervised method, judges the similarity between voxels by feature distance or angular discrepancy [10]. However, it does not take spatial correlation into account [11]. Markov Random Fields (MRF), a semi-supervised strategy, can accommodate this problem to reduce region overlapping and voxel noise [12]. Gering et al. [13] used a multi-layer MRF framework to detect brain tumors, in which each layer focuses on different intensity information such as neighborhood coherence, intra-structure properties, and inter-structure relationships. Nie et al. [14] proposed the Spatial-accuracy-weighted Hidden Markov random field and Expectation maximization (SHE) approach to address the anisotropy caused by the different resolutions of MR images along different axes [15]. Unlike MRF, Conditional Random Fields (CRF) directly learn discriminative models, which saves computation time [16].
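The unsupervised clustering idea — grouping voxels purely by distance in multi-modal feature space, with no spatial term — can be sketched with a plain k-means loop. This is a generic illustration under the assumption of Euclidean feature distance; the function name and the fixed iteration count are made up for the example.

```python
import numpy as np

def kmeans_voxels(features, k, iters=20, seed=0):
    """Hypothetical sketch of unsupervised voxel clustering: each row of
    `features` holds one voxel's multi-modal intensities (Flair, T1,
    T1c, T2); similarity is plain Euclidean feature distance."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)].astype(float)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Assign each voxel to its nearest cluster center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster emptied.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```

Note that the distance involves only intensities, never voxel coordinates — which is precisely the missing spatial correlation that MRF and CRF models reintroduce.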

The Convolutional Neural Network (CNN) technique [17] is commonly used for supervised pixel classification and is well suited to segmenting heterogeneous information without assuming a parametric distribution. Building on traditional neural networks, Bengio et al. [18], [19] proposed deeper architectures that can extract more abstract non-linear features. Krizhevsky and Hinton [20] made a great breakthrough in the ImageNet recognition challenge with a deep CNN, which has also driven progress in image segmentation and in brain tumor segmentation in particular [21], [22], [23], [24], [25]. Zikic et al. [21] used a CNN directly to classify each voxel from the preprocessed multi-channel intensity information of a small patch around that voxel. Havaei et al. [22] built a cascade of two networks, one of which outputs feature maps extracted from larger input patches as part of the input layer of the other network, thereby bringing in more global contextual features. Pereira et al. [23] investigated the potential of deep architectures with small convolutional kernels for segmenting MR images after intensity normalization. Dvǒrák et al. [24] proposed a mixture model of clustering and CNNs. Kamnitsas et al. [25] used a dual-pathway, 11-layer, three-dimensional CNN model named DeepMedic to classify various tissues in MR images.

In this paper, because of the anisotropic resolution of MR images, we conduct experiments on 2D horizontal planes along the perpendicular axis. With respect to the CNN architecture, we make two assumptions. First, there are non-linear relations between the intensities of the different modalities (Flair, T1, T1c, and T2) of a voxel in MR images. Second, the outputs of average pooling layers and max pooling layers, representing global and local information respectively, are mutually complementary when intensity data such as MR images are taken as input. Inspired by Szegedy et al. [26], we introduce 1 × 1 filters into the input layer; however, their main purpose here is to add non-linear dimensions to the input data rather than to reduce module dimensions. At the same time, a two-pathway CNN relates max pooling layers to average pooling layers. FCRF, as a post-processing method, reinforces global context information to optimize the probability map produced by the CNN model. Data pre-processing is necessary because of the data heterogeneity caused by multi-site, multi-scanner acquisition of MR images. The normalization method for MR images proposed by Nyúl et al. [27] is used and analyzed in the experiments.
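The two architectural ideas above — 1 × 1 kernels mixing the four modalities, then parallel max- and average-pooling pathways — can be sketched in PyTorch. This is a minimal sketch of the idea only: the channel counts, kernel sizes, and the single convolution per pathway are illustrative assumptions, not the paper's exact configuration (which is given in its Table 1).

```python
import torch
import torch.nn as nn

class TwoPathwayCNN(nn.Module):
    """Hypothetical sketch: 1 x 1 kernels first form non-linear
    combinations of the four input modalities, then a max-pooling path
    (local detail) and an average-pooling path (global context) are
    fused before the per-pixel classification head."""

    def __init__(self, n_classes=5):
        super().__init__()
        # 32 1x1 kernels over the 4 modalities (Flair, T1, T1c, T2).
        self.mix = nn.Sequential(nn.Conv2d(4, 32, kernel_size=1), nn.ReLU())
        self.max_path = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2))
        self.avg_path = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2))
        self.head = nn.Conv2d(128, n_classes, kernel_size=1)

    def forward(self, x):
        x = self.mix(x)
        # Concatenate the local (max) and global (average) pathways.
        fused = torch.cat([self.max_path(x), self.avg_path(x)], dim=1)
        return self.head(fused)  # class scores at half resolution
```

The CNN's per-voxel class probabilities would then be passed to the FCRF for global refinement.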

Section snippets

Material and methods

The proposed method comprises three stages: data pre-processing, classification with CNNs, and post-processing with FCRF.
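The pre-processing stage relies on the Nyúl intensity standardization mentioned in the introduction. A minimal sketch of the core idea — piecewise-linear mapping of a volume's percentile landmarks onto standard-scale landmarks learned from the training set — is shown below; the percentile choices and function name are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def nyul_standardize(volume, landmarks_std, pcts=(1, 50, 99)):
    """Hypothetical sketch of Nyul-style standardization: map this
    volume's percentile landmarks onto the standard-scale landmarks
    `landmarks_std` with piecewise-linear interpolation."""
    landmarks = np.percentile(volume, pcts)  # this volume's landmarks
    out = np.interp(volume.ravel(), landmarks, landmarks_std)
    return out.reshape(volume.shape)
```

After this mapping, corresponding tissue intensities from different scanners fall on a comparable scale, which is what makes a single CNN trainable across multi-site acquisitions.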

Result

As mentioned previously, in addition to the CNN architecture proposed in Table 1, we build several related CNNs for comparison. The Without 1 × 1 Kernel architecture in Table 2 has no 1 × 1 filters but is otherwise identical to the proposed CNN. The TwoAverPooling model in Table 3 replaces the 7 × 7 average pooling layer of the proposed model with two 5 × 5 average pooling layers. We refer to the architectures with only max pooling layers as TwoMaxPath in Table 4 and OneMaxPath in Table 5 which

Conclusion

In this paper, we proposed an automatic mixture brain tumor segmentation method based on CNN and FCRF. For comparison, we also presented several contrasting architectures and analysed their performance under different training sets and stages.

The experimental results indicate that 1 × 1 kernels can increase the PPV and Dice measures to some extent in brain tumor segmentation. During the experiments, we found that 32 1 × 1 kernels following the input layer work best.

Funding

This work was supported by the National Natural Science Foundation of China [Grant number 61672386]; the Anhui Provincial Natural Science Foundation of China [Grant number 1708085MF142]; the Major Research Project Breeding Foundation of Wannan Medical College [Grant number WK2017Z01]; the Anhui Provincial Humanities and Social Science Foundation of China [Grant number SK2018A0201]; and the Anhui Province Key Laboratory of Affective Computing & Advanced Intelligent Machine [Grant number ACAIM180202

References (40)

  • S. Lakare, A. Kaufman, 3D segmentation techniques for medical volumes, Center for Visual Computing, Department of...
  • J. Han et al., Robust object co-segmentation using background prior, IEEE Trans. Image Process. (2018)
  • W. Deng, W. Xiao, H. Deng, J. Liu, MRI brain tumor segmentation with region growing method based on the gradients and...
  • L. Zhang et al., Discovering discriminative graphlets for aerial image categories recognition, IEEE Trans. Image Process. (2013)
  • J. Han et al., Advanced deep-learning techniques for salient and category-specific object detection: a survey, IEEE Signal Process. Mag. (2018)
  • B.H. Menze, K. Van Leemput, D. Lashkari, M.-A. Weber, N. Ayache, P. Golland, A generative model for brain tumor...
  • D.T. Gering, W.E.L. Grimson, R. Kikinis, Recognizing deviations from normalcy for brain tumor segmentation, in:...
  • G. Cheng et al., Duplex metric learning for image set classification, IEEE Trans. Image Process. (2018)
  • C.-H. Lee, S. Wang, A. Murtha, M. Brown, R. Greiner, Segmenting brain tumors using pseudo-conditional random fields,...
  • Y. LeCun et al., Gradient-based learning applied to document recognition, Proc. IEEE (1998)