Keywords

1 Introduction

Brain malignancies are the most dangerous pathologies in neurological diseases. These malignancies present different degrees of aggressiveness, different prognosis and heterogeneous histological sub-regions (i.e., peritumoral edema, necrotic core, enhancing and non-enhancing tumor core). This variability (due to intrinsic heterogeneities of gliomas) poses a challenging task in which the imaging phenotype is described by varying appearance and shape profiles across multimodal MRI scans, reflecting varying tumor tissue properties. This variability poses a challenging task in which the imaging phenotype is described by varying appearance and shape biological descriptors across neurological scans, reflecting varying tumor tissue properties [12].

Localizing tumor areas is a crucial procedure for brain surgery planing. One of the main problems in this case is in fact the time, in practice, radiation oncologists spend a substantial portion of their time performing the segmentation manually using segmentation and visualization tools. Besides, in the literature survey, several techniques are proposed to overcome the challenges of brain tumor segmentation. Specifically, brain tumor segmentation algorithms based on convolutional neural networks (CNNs) have been shown to be at least as effective as other automated tumor segmentation methods [8].

In recent years, the researchers focused on exploring the entire field related to neural networks, a fully automatic pipeline that involves chaining together several unique 3D U-Net, a type of 3D patch-based convolutional neural network [1]. In general, state of the art focuses on models that initiate a process of forced learning given by iterations that cause specific weights to vary until an acceptable result is reached (i.e., to find a plausible tumor contour) [12]. The main problem of these approaches is that although the results are relevant, it is difficult to extract clinical information from these learning processes (i.e., to capture tissue properties from hidden layers as in CNNs) [9].

Patch-based methods make use of energy functions to define a given contour that matches plausible shape structures (i.e., tumor contour) [10]. These energy functions allow us to define the tumor contour from appearance and shape constraints based on the tumor properties [3]. However, the resulting performance of these approaches depends on the correct selection of the model parameters (i.e., graph cuts (GC) for image segmentation) [6]. Global optimization is an essential task in any complex problem where design and choice of model parameters play a key role. In the machine learning field, such problems are found in the tuning of hyperparameters [15] and experimental design [7].

Bayesian optimization (BO) [7, 15], proves to outperform state of the art for global optimization algorithms on many challenging optimization benchmark functions [11]. In this context, Bayesian optimization assumes that the objective function is sampled from a Gaussian process, maintaining the posterior distribution for this function as observations (by running learning algorithm experiments with different hyperparameters are observed). In this paper, we propose an enhanced Graph cut on which the model parameters are selected through a probabilistic approach. Here, we use Bayesian optimization to find the optimal hyperparameters that segment the tumor volume accurately. Our contribution is based on the Bayesian optimization process that finds the model parameters for controlling the energy function of the graph cut in a probabilistic way. The rest of the paper proceeds as follows. Section 2 provides a detailed discussion of materials and methods. Section 3 presents the experimental results and some discussions about the proposed method. The paper concludes in Sect. 4, with a summary and some ideas for future research.

2 Materials and Methods

2.1 Database

In this work we used the Brain Tumor Image Segmentation Challenge (Brats) 2015 [12]. This Database contains high-grade tumors, Low-grade tumors and labels maps made by experts based on landmarks. The tumors of this database are located in different brain regions. The label map showed in Fig. 1, have four different labels 1- for Necrosis (Green), 2- for the Edema (Yellow), 3- for Non-enhancing tumor (Red) and 4- for Enhancing tumor (Blue). We used the MRI T1 images with resolution of \(240\times 240\) pixels and 1 mm \(\times \) 1 mm \(\times \) 1 mm voxel size.

Fig. 1.
figure 1

Sample of a given abnormal tissue for the Brast2015 Database. (Color figure online)

2.2 Graph Cuts

We use a fast approximate energy minimization approach with label costs, that uses the alpha-expansion algorithm [5]. This algorithm is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms that describes given appearance contour [3]. The segmented image can be modeled as an energy minimization that finds a labeling f (i.e., estimated contour) as,

$$\begin{aligned} E(f)=E_{\mathrm {smooth}}(f)+E_{\mathrm {data}}(f)+E_{\mathrm {label}}(f), \end{aligned}$$
(1)

where, \( E_{\mathrm {smooth}}(f) \) is a measure of the smoothness by parts of the labeling f, and \( E_{\mathrm {data}}(f) \) measures the discrepancy between f and the observed data. As in [5], the term \( E_{\mathrm {data}}(f) \) is computed as,

$$\begin{aligned} E_{data}(f)=\sum _{p\epsilon P}D_{p}(f_{p}), \end{aligned}$$
(2)

where \( D_{p} \) measures how well the label \( f_{p} \) fits the pixel p. Generally, this is evaluated using a quadratic standard, which can be given by \( (f_{p}-i_{p})^{2}\) where \(i_p\) is the original intensity of the pixel. The smoothness cost \( E_{\mathrm {smooth}}(f) \), is a standard regularizer which can be modeled as,

$$\begin{aligned} E_{\mathrm {smooth}}(f)=\sum _{p,q\epsilon \mathcal {N}}V_{p,q}(f_{p},f_{q}), \end{aligned}$$
(3)

where each \( V_{p,q} \) weights all \( f_{p}\ne f_{q} \). A simple use of this function can be given by \( V_{p,q}(f_{p},f_{q})= K\cdot |f_{p}-f_{q}| \) (with K being an arbitrary constant). Hence, if each \( V_{p,q} \) define a metric, then the minimization of the Eq. (1) it is known as the problem of metric labeling and can be effectively optimized with the alpha-expansion algorithm [5]. The label cost penalize each unique label that appears in f as \(E_{\mathrm {label}}(f)=\sum _{L\subseteq \mathcal {L}}h_L\cdot \delta _{L}(f)\), where \(h_L\) is the non-negative label cost of labels L and the indicator function \(\delta _{L}(.)\) is defined on a label subset L as,

$$\begin{aligned} \delta _{L}(f)\overset{\underset{\mathrm {def}}{}}{=}\left\{ \begin{matrix} 1 &{} \exists p : f_p\in L\\ 0 &{} otherwise \end{matrix}\right. \end{aligned}$$
(4)

2.3 Bayesian Optimization with Gaussian Process Priors

Since we want to compute the graph cuts hyperparameters in a probabilistic way, our goal is to find the minimum of a cost function \(f(\mathbf {x})\) (i.e., the performance index between the ground truth labels and the segmented tumor) on some bounded set \(\mathcal {X}\) that controls the model parameters. To this end, Bayesian optimization builds a probabilistic framework for \(f(\mathbf {x})\) with the aim to exploit this model to make predictions of the model parameters \(\mathcal {X}\) evaluated in the cost function [15]. The main components of the Bayesian optimization framework are the prior of the function to optimize, as well as the acquisition function that will allow us to determine the next point to evaluate the cost function [13]. In this work, we use a Gaussian process prior, due to its flexibility and tractability. A Gaussian Process (GP) is an infinite collection of scalar random variables indexed by an input space such that for any finite set of inputs \(\mathbf {X}=\{\mathbf {x}_1,\mathbf {x}_2,\cdots ,\mathbf {x}_n\}\), the random variables \(\mathbf {f} \overset{\Delta }{=} [f(\mathbf {x}_1),f(\mathbf {x}_2),\cdots ,f(\mathbf {x}_n)]\) are distributed according to a multivariate Gaussian distribution \(\mathbf {f}(\mathbf {X})=\mathcal {GP}(m(\mathbf {x}),k(\mathbf {x},\mathbf {{x}}^\prime ))\). A GP is completely specified by a mean function \( m(\mathbf {x})=\mathbb {E}\left[ f(\mathbf {x}) \right] \) (usually defined as the zero function) and a positive definite covariance function given by \(k(\mathbf {x},\mathbf {{x}}^\prime )=\mathbb {E}\left[ (f(\mathbf {x})-m(\mathbf {x})){(f(\mathbf {{x}}^\prime )-m(\mathbf {{x}}^\prime )}^{T}) \right] \) (see [15] for further details).

Let us assume that \(f(\mathbf {x})\) is drawn from a Gaussian process prior and that our observations are set as \( \left\{ \mathbf {x}_{n},y_{n} \right\} _{n=1}^{N} \), where \( y_{n}\sim \mathcal {N}(f(\mathbf {x}_{n}),\nu )\) and \(\nu \) is the noise variance. The acquisition function is denoted by \( a:\mathcal {X}\rightarrow \mathbb {R}^{+} \) and establishes the point in \( \mathcal {X} \) that is evaluated in the optimization process as \( {x}_{\mathrm {next}}=\mathrm {arg\,max}_{\mathbf {x}}a(\mathbf {x}) \). Since the acquisition function depends on the GP hyperparameters, \(\theta \), and the predictive mean function \( \mu (\mathbf {x};\{\mathbf {x}_n,\mathbf {y}_n\},\theta ) \) (as well as the predictive variance function), the best current value is then \({x}_{\mathrm {best}}=\mathrm {arg\,min}_{\mathbf {x}_n}f(\mathbf {x}_n)\).

2.4 Enhanced Graph Cuts with Bayesian Optimization

Our approach is based on the Bayesian optimization process for estimating the model parameters of the graph cut model that segments a given brain tumor accurately in a probabilistic way. In this work, we choose to optimize the foreground seed, \( \varOmega _{1}=\{x_{f},y_{f},z_{f}\} \), the background seed \( \varOmega _{2}=\{x_{b},y_{b},z_{b}\} \), and the \(\alpha \)-parameter of the swap algorithm \(\varOmega _{3}=\alpha \) [5]. For the graph cuts implementation we use the imcutFootnote 1 toolbox. Besides, as for the Bayesian optimization process, we use as a cost function, the Euclidean distance between the labels of the segmented tumor and the ground truth labels. We use the GPyOptFootnote 2 toolbox for python, developed by the Machine Learning group of the University of Sheffield. In this work, we report results for the expected improvement (EI), and the probability of improvement (PI) and some other relevant acquisition functions [15]. Figure 2 shows the block diagram of the proposed model used in this work.

Fig. 2.
figure 2

Block diagram of the proposed approach for the enhanced graph cuts with Bayesian optimization

3 Results and Discussions

In this section, we show the results of our framework for optimizing the graph cuts hyperparameters. We show a comparison between a given manual tuning and an automatic tuning using Bayesian optimization. Besides, we report a comparison of the different segmentation performances of the acquisition functions, as well as some qualitative and quantitative results compared with relevant works in the state-of-art.

As we can see in Fig. 3, Bayesian optimization can eliminate certain inconveniences that arise for manual tunning of the graph cuts. The figure shows that the optimal parameters allow us to segment the tumor contour more appropriately (i.e., avoiding false negatives derived from the segmentation process).

Fig. 3.
figure 3

Comparison between Bayesian optimization (middle) and manual adjustment, of graph cuts hyperparameters (left). The figure shows that the manual tuning of the graph cut model (left) derived in more false negatives in comparison with ground truth tumor (GT right).

Figure 4 shows the convergence of each acquisition function of the BO process. The red plots show the distance between the hyperparameters on each iteration. As a result, we can differentiate the stages of exploration and exploitation of the hyperparameters. Here, the more variation found between each consecutive hyper-parameter indicates the stage of exploration and small distances between consecutive hyperparameters indicates the stage of exploitation. The figures outlined in blue indicate the error convergence of each method. Furthermore, the results also show that the acquisition functions: integrated lower confidence bound and integrated probability improvement perform the tumor segmentation more accurately (see bottom row of Fig. 4).

Fig. 4.
figure 4

Convergence of the Bayesian optimization process for different acquisition functions. The figure shows the distance between values of x selected consecutively (red plots), and the minimum value of the performance index obtained in each iteration (blue plots). (Color figure online)

Figure 5 shows the curvature computed for three different tumors: ground truth volumes (left) and segmented tumors with BO (right). The results show that regions with high saliency (red areas in the tumor volumes), matches the segmented tumors with the optimal hyperparameters (the segmentation process preserves the relative curvature of the original tumor). Finally, Table 1 shows a comparison of different approaches reported in the state-of-the-art (BraTS 2017 challenge [12]). The results show that our approach outperforms some relevant methods in the state-of-art, which are based on deep learning approaches. Hence, since our approach performs the segmentation in an unsupervised manner, the probabilistic tuning of the model parameters sets an important result for these image segmentation approaches.

Fig. 5.
figure 5

Curvature comparison of three different subjects (left to right). The bottom row shows the curvature obtained from the segmented tumors with GC-BO. The figure shows the curvature obtained from the ground truth volumes. The figure shows that regions with high saliency (i.e., red colors) match the ground truth data. (Color figure online)

Table 1. Comparison with different methods proposed in [12] for brain tumor segmentation. We report the results as in the BRATS challenge: Dice coefficient, Hausdorff distance, sensitivity, and specificity (Some articles do not report some metrics, so they are assigned to N.R).

4 Conclusions and Future Work

In this paper, we propose a Bayesian optimization framework for tuning the model parameters of a graph cuts method. Our method seeks to find the best parameters that segment a given tumor contour in a probabilistic way. The experimental results show that our approach derives in more accurate contours than a given classical procedure in image segmentation with graph cuts. Besides, since the model parameters are optimized, the resulting curvature of the segmented tumor preserves main saliency regions that match the ground truth data. Finally, our approach outperforms some important methods in the state-of-art that use deep learning frameworks.

As for the future works, we plan to extend the classical Energy minimization problem of the region growing approaches to propose a new function that can be optimized with probability black-box functions.