
1 Introduction

Brain disease is one of the main threats to human health. Using brain imaging for the qualitative and quantitative analysis of brain function is helpful for the effective diagnosis of brain disease [1]. Brain MR images have high contrast among different soft tissues and high spatial resolution across the entire field of view, which makes them easier to analyze than other medical images. Unfortunately, due to the imaging mechanism and the complexity of organ and tissue structure, MR images exhibit gray-level inconsistency, weak boundaries and other phenomena. Most segmentation methods are hindered by such imaging artifacts as noise and intensity inhomogeneity [2].

Although researchers have proposed many methodologies [3,4,5] for segmentation, challenges remain that come from low contrast, intensity inhomogeneity and noise perturbation. In practical applications, especially in medical image segmentation, increasingly stringent requirements on the accuracy of segmentation results make it urgent to find more robust approaches. The finite mixture model (FMM) is one of the most widely used clustering models in the field of image segmentation. Clustering is the process of arranging data into groups that have similar characteristics. When the conditional probability in the FMM is chosen as a Gaussian distribution, the model is called the Gaussian mixture model (GMM) [6, 7]. GMM can deal with multivariate data effectively and flexibly because its small number of parameters can be estimated simply by adopting the expectation maximization (EM) algorithm. However, GMM will not produce an accurate segmentation result when the image is corrupted by heavy noise or contains outliers. To deal with such problems, some scholars have proposed increasing the number of mixture components of GMM to capture the heavy tails of the distribution, which comes at the expense of increased model complexity.

The Student's t-mixture model (SMM) proposed by Peel and McLachlan [8] is used as an alternative to GMM. Having an additional parameter called the degrees of freedom, SMM conforms to elliptical symmetry, which is wider than the circular symmetry of GMM. Hence, SMM is more robust than GMM in the presence of heavy outliers. However, SMM and GMM assume that each pixel in the image is independent, and neither of them uses the prior information of the pixels in its neighborhood. Thus, GMM and SMM are still sensitive to noise.

In order to reduce the effect of noise, mixture models with a Markov random field (MRF) have been widely used for pixel labeling [9,10,11], where the prior distribution of each pixel's label depends on its neighboring pixels. Further, some models [12,13,14,15,16,17] were proposed to model the joint distribution of the pixels in order to incorporate spatial smoothness constraints between neighboring pixels. Among these improvements, an EM algorithm was proposed by Diplaros et al. [14] to estimate the parameters of the model. On the assumption that the hidden class labels of pixels are generated by prior distributions that share similar parameters for neighboring pixels, a pseudo-likelihood was introduced to couple neighboring prior information based on the Kullback-Leibler (KL) divergence [16]. Furthermore, Nikou et al. [15] proposed a family of smoothness priors based on the MRF which guarantee a different smoothness strength for each cluster. This model also captures information in different spatial directions and obtains closed-form update equations through the EM algorithm. Although these models reduce the influence of noise on the segmentation to some extent, most of the improvements come at the expense of more model parameters and higher model complexity.

Motivated by the aforementioned observations, in this paper we propose a new method of spatial constraint construction. The spatially variant finite mixture model (SVFMM) in [12] constructs the spatial constraint of the model only from the priors of the current pixel and its neighborhood. On the basis of SVFMM, we first choose the Student's t-distribution instead of the Gaussian distribution to deal with outliers. In order to improve the robustness of the model to noise, we construct a new penalty term for the model which incorporates both the posterior distributions and the prior information. In the tth iteration of the EM algorithm, the prior distribution obtained in the M-step normally uses only the result of the (t-1)th iteration. With our improvement, the priors obtained in the tth iteration use the posteriors from the E-step of the same iteration, which can correct possible errors in the algorithm. In practice, however, there is also the phenomenon of intensity inhomogeneity, which affects segmentation accuracy. The bias field is mainly related to the radio frequency and the static noise field; Jungke et al. [18] clearly stated that the main challenge of medical image segmentation is the bias field rather than the noise. In this paper, we use basis functions to fit the bias field to improve the robustness of the model.

2 Background

2.1 Student's t-Mixture Model (SMM)

The finite mixture model is one of the most widely used clustering models for image segmentation. Let \(y^i\) denote the observation at the ith pixel of an image \((i=1,...,N)\) and \(x^i\) denote the class label of that pixel. The density function for a \(K\)-component finite mixture model is

$$\begin{aligned} f(y|x;\theta _1,\theta _2,...,\theta _K;\pi _1,\pi _2,...,\pi _K)=\sum _{j=1}^K \pi _j \psi _j(y|x;\theta _j). \end{aligned}$$
(1)

For the ith pixel belonging to the jth class, \(p(x_i=j)=\pi _j\) represents the prior distribution, which satisfies \(0<\pi _j<1\) and \(\sum _{j=1}^K \pi _j=1\). When \(\psi _j\) is selected as a Gaussian distribution, (1) represents the Gaussian mixture model (GMM).

It has been proved that the Student's t-distribution has a heavier tail than the Gaussian distribution, and the finite mixture model of the Student's t-distribution provides a more robust alternative to GMM. The density function of the Student's t-distribution [19] is

$$\begin{aligned} t(y_i|\theta _j)= & {} t(y_i|\mu _j,\varSigma _j,\upsilon _j)\nonumber \\= & {} \frac{\varGamma (\frac{\upsilon _j}{2}+\frac{p}{2})}{\varGamma (\frac{\upsilon _j}{2})}\frac{|\varSigma _{j}|^{-\frac{1}{2}}}{(\pi \upsilon _{j})^{\frac{p}{2}}}\times [1+\frac{(y_i-\mu _j)^{T}\varSigma _{j}^{-1}(y_i-\mu _j)}{\upsilon _{j}}]^{-\frac{\upsilon _{j}+p}{2}}, \end{aligned}$$
(2)

where \(\mu _j\), \(\varSigma _j\) and \(\upsilon _j\) refer to the mean, covariance and degrees of freedom of the multivariate Student's t-distribution.
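To make Eq. (2) concrete, a minimal Python sketch of the density follows (our own illustration, assuming NumPy and SciPy; mu, Sigma and nu mirror \(\mu _j\), \(\varSigma _j\) and \(\upsilon _j\) and are not names from the paper):

```python
import numpy as np
from scipy.special import gammaln

def student_t_pdf(y, mu, Sigma, nu):
    """Density t(y | mu, Sigma, nu) of a p-variate Student's t-distribution, Eq. (2)."""
    p = mu.shape[0]
    diff = y - mu
    maha = diff @ np.linalg.solve(Sigma, diff)        # (y-mu)^T Sigma^{-1} (y-mu)
    log_norm = (gammaln((nu + p) / 2.0) - gammaln(nu / 2.0)
                - 0.5 * np.linalg.slogdet(Sigma)[1]
                - 0.5 * p * np.log(np.pi * nu))
    return np.exp(log_norm - 0.5 * (nu + p) * np.log1p(maha / nu))
```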

Although the FMM is widely used in image segmentation, it assumes that each pixel is independent; no spatial information about the pixels is used. In order to overcome this drawback, the Spatially Variant Finite Mixture Model (SVFMM) [12] was proposed.

2.2 Spatially Variant Finite Mixture Model (SVFMM)

The SVFMM [13] method considers a maximum a posteriori (MAP) approach and introduces the spatial information based on the following Gibbs function [12, 20, 21]

$$\begin{aligned} P(\varPi )=\frac{1}{Z}exp(-U(\varPi )), \end{aligned}$$

where

$$\begin{aligned} U(\varPi )=\beta \sum _{i=1}^N V_{N_{i}}(\varPi ). \end{aligned}$$
(3)

Here Z is a normalizing constant and \(U(\varPi )\) is an energy function. The function \(V_{N_{i}}\) stands for the clique potential associated with the neighborhood \(N_{i}\) and can be computed as follows

$$\begin{aligned} V_{N_{i}}(\varPi )=\sum _{m\in N_i}g(u_{i,m}), \end{aligned}$$
(4)

where \(u_{i,m}\) specifies the distance between the two label vectors \(\pi ^{i}\) and \(\pi ^{m}\), which can be described as

$$\begin{aligned} u_{i,m}=\Vert \pi ^{i}-\pi ^{m}\Vert ^{2}=\sum _{j=1}^K (\pi _j^{i}-\pi _j^{m})^{2}. \end{aligned}$$
(5)

The function g(u) in (4) should be monotonically increasing and nonnegative; one choice [21] is

$$\begin{aligned} g(u)=(1+u^{-1})^{-1}. \end{aligned}$$
(6)
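As a concrete illustration of Eqs. (4)-(6), the following Python sketch (our own, not from the paper) evaluates the clique potential for one pixel of an (H, W, K) array of label probabilities; the 4-neighbourhood and the small eps guarding against division by zero are assumptions:

```python
import numpy as np

def g(u):
    """Eq. (6): g(u) = (1 + 1/u)^{-1}, monotonically increasing and nonnegative."""
    return 1.0 / (1.0 + 1.0 / u)

def clique_potential(pi, i, j, eps=1e-12):
    """Eq. (4): sum of g(u_{i,m}) over the 4-neighbourhood of pixel (i, j)."""
    H, W, K = pi.shape
    total = 0.0
    for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        m_i, m_j = i + di, j + dj
        if 0 <= m_i < H and 0 <= m_j < W:
            u = np.sum((pi[i, j] - pi[m_i, m_j]) ** 2)   # Eq. (5)
            total += g(u + eps)                          # eps avoids 1/0 at u = 0
    return total
```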

In recent years, SVFMM has attracted considerable research and improvement. One route in these studies is to construct a spatial constraint based on the Markov random field (MRF). Most of the improvements in [12, 16, 22] achieve better segmentation, but usually at the cost of an increased number of model parameters.

3 Proposed Method

We choose the Student's t-distribution instead of the Gaussian distribution to improve the robustness of the model to outliers, and, in order to make full use of the spatial information between pixels to improve the accuracy of the model, we combine the Student's t-distribution with SVFMM. We propose a new way of adding the spatial constraint based on SVFMM, which improves segmentation accuracy and controls model complexity without increasing the number of parameters. The new distance in the clique potential (4) is

$$\begin{aligned} u_{i,m}=\sum _{j=1}^K [(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}]. \end{aligned}$$
(7)

Here \(z_j^{i}\) denotes the conditional expectation of the hidden variables, which is computed in the E-step of the EM algorithm. The distance in the clique potential (4) describes the difference between the prior probabilities of each pixel and its neighborhood, and the energy function (3) is large when this difference is obvious. We improve the distance so that it also considers the latest posterior probabilities of each iteration, which can correct probable errors from the previous iteration. After adding the improved spatial information, the density function of our model is

$$\begin{aligned} f(y|\varTheta )= & {} \sum _{j=1}^K \pi _j t_j(y|\theta _{j})\cdot p(\varPi )\nonumber \\= & {} \sum _{j=1}^K \pi _j\frac{\varGamma (\frac{\upsilon _j}{2}+\frac{p}{2})}{\varGamma (\frac{\upsilon _j}{2})}\frac{|\varSigma _{j}|^{-\frac{1}{2}}}{(\pi \upsilon _{j})^{\frac{p}{2}}}\times [1+\frac{(y-\mu _j)^{T}\varSigma _{j}^{-1}(y-\mu _j)}{\upsilon _{j}}]^{-\frac{\upsilon _{j}+p}{2}}\cdot p(\varPi ), \end{aligned}$$
(8)

where

$$\begin{aligned} p(\varPi )=\frac{1}{Z}exp(-\beta \sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}). \end{aligned}$$
(9)
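The following sketch (ours; the (H, W, K) array shapes and the enumeration of horizontal/vertical neighbour pairs are assumptions, and each unordered pair is counted once rather than twice) evaluates the exponent of Eq. (9) for priors pi and posteriors z:

```python
import numpy as np

def smoothness_energy(pi, z, beta=1.0, eps=1e-12):
    """Exponent of Eq. (9): beta * sum over neighbouring pairs of g(u_{i,m})."""
    energy = 0.0
    for d_pi, d_z in [
        (pi[1:, :] - pi[:-1, :], z[1:, :] - z[:-1, :]),   # vertical pairs
        (pi[:, 1:] - pi[:, :-1], z[:, 1:] - z[:, :-1]),   # horizontal pairs
    ]:
        u = np.sum(d_pi ** 2 + d_z ** 2, axis=-1)          # Eq. (7)
        energy += np.sum(1.0 / (1.0 + 1.0 / (u + eps)))    # g(u), Eq. (6)
    return beta * energy
```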

In fact, a major problem in the segmentation of MR images is the intensity inhomogeneity caused by the bias field. Under the influence of the bias field, the image formation can be modeled as

$$\begin{aligned} I=I_{0}\times B+N, \end{aligned}$$
(10)

where I is the observed image, \(I_{0}\) is the true image, B is the bias field and N is the noise.
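A toy illustration of Eq. (10) in Python (ours; the two-class image, the Gaussian-shaped multiplicative bias and the noise level are made-up values for demonstration only):

```python
import numpy as np

H = W = 128
yy, xx = np.mgrid[0:H, 0:W] / H
I0 = np.where(xx + yy < 1.0, 100.0, 200.0)      # "true" two-class image
B = 0.7 + 0.6 * np.exp(-((xx - 0.5)**2 + (yy - 0.5)**2) / 0.2)  # smooth bias
N = np.random.normal(0.0, 10.0, size=(H, W))    # additive noise
I = I0 * B + N                                   # observed image, Eq. (10)
```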

We improve the spatial constraint method, use basis functions to fit the bias field, and couple the fitted field to the model. Our model is given as follows

$$\begin{aligned} f(y|\varTheta )= & {} \sum _{j=1}^K \pi _j t_j(y|\theta _{j})\cdot p(\varPi )\nonumber \\= & {} \sum _{j=1}^K \pi _j\frac{\varGamma (\frac{\upsilon _j}{2}+\frac{p}{2})}{\varGamma (\frac{\upsilon _j}{2})}\frac{|\varSigma _{j}|^{-\frac{1}{2}}}{(\pi \upsilon _{j})^{\frac{p}{2}}}\times [1+\frac{(y-B\mu _j)^{T}\varSigma _{j}^{-1}(y-B\mu _j)}{\upsilon _{j}}]^{-\frac{\upsilon _{j}+p}{2}}\cdot p(\varPi ), \end{aligned}$$
(11)

where

$$\begin{aligned} p(\varPi )=\frac{1}{Z}exp(-\beta \sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}). \end{aligned}$$
(12)

4 Parameter Learning

In this section we use the EM algorithm to estimate the parameters of the model. The log-likelihood function of our model is written in the form

$$\begin{aligned} L(\varPi ,\varTheta |Y)= & {} \sum _{i=1}^N log\{\sum _{j=1}^K \pi _j t_j(y|\theta _j)\}\nonumber \\- & {} logZ-\beta \sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}. \end{aligned}$$
(13)

Applying the complete-data condition in [12] and Jensen's inequality, we can get

$$\begin{aligned} J(\varPi ,\varTheta |Y)= & {} \sum _{i=1}^N \sum _{j=1}^K z_j^i\{log\pi _j +logt_j(y|\theta _j)\}\nonumber \\- & {} logZ-\beta \sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}. \end{aligned}$$
(14)

Similar to the MRF-based methods [12, 13], Z and \(\beta \) in (14) are set to one. The new objective function is given by

$$\begin{aligned} J(\varPi ,\varTheta |Y)= & {} \sum _{i=1}^N \sum _{j=1}^K z_j^i\{log\pi _j +logt_j(y|\theta _j)\}\nonumber \\- & {} \sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}. \end{aligned}$$
(15)

With the help of the scaling factor \(u_j^i\), the Student's t-distribution can be represented as an infinite mixture of Gaussians with the same mean and a scaled covariance, as follows

$$\begin{aligned} t_j(y^i|\theta _j)=N(y^i|\mu _j,\frac{\varSigma _j}{u_j^i}), \end{aligned}$$
(16)

where

$$\begin{aligned} N(y^i|\mu _j,\frac{\varSigma _j}{u_j^i})=\frac{|\varSigma _j|^{-\frac{1}{2}}(u_j^i)^{\frac{p}{2}}}{(2\pi )^{\frac{p}{2}}}exp\{-\frac{1}{2}u_j^i(y^i-\mu _j)^T\varSigma _j^{-1}(y^i-\mu _j)\} \end{aligned}$$
(17)

and

$$\begin{aligned} u_j^{i}\sim gamma(\frac{\upsilon _j}{2},\frac{\upsilon _j}{2}) \end{aligned}$$
(18)
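Equations (16)-(18) say that drawing \(u\sim Gamma(\upsilon /2,\upsilon /2)\) and then \(y\sim N(\mu ,\varSigma /u)\) reproduces Student's t samples. A quick Python check (ours; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, Sigma, nu = np.zeros(2), np.eye(2), 4.0
u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=5000)   # rate nu/2, Eq. (18)
y = np.array([rng.multivariate_normal(mu, Sigma / ui) for ui in u])  # Eq. (17)
# Empirical covariance approaches nu/(nu-2) * Sigma for nu > 2:
print(np.cov(y.T))   # ~ 2 * I for nu = 4
```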

According to [8], the complete-data log likelihood can be factored into three parts and can be given as

$$\begin{aligned} logL_c (\varTheta )=logL_{1c} (\pi )+logL_{2c} (\upsilon )+logL_{3c} (\theta ), \end{aligned}$$
(19)

where

$$\begin{aligned} logL_{1c} (\pi )=\sum _{i=1}^N\sum _{j=1}^K z_j^i log\pi _j-\sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^{i}-\pi _j^{m})^{2}+(z_j^{i}-z_j^{m})^{2}])^{-1})^{-1}, \end{aligned}$$
(20)
$$\begin{aligned} logL_{2c} (\upsilon )=\sum _{i=1}^N\sum _{j=1}^K z_j^i\{-log\varGamma (\frac{\upsilon _j}{2})+\frac{1}{2}\upsilon _jlog(\frac{\upsilon _j}{2})+\frac{1}{2}\upsilon _j(logu_j^i-u_j^i)-logu_j^i\}, \end{aligned}$$
(21)
$$\begin{aligned} logL_{3c} (\theta )=\sum _{i=1}^N\sum _{j=1}^K z_j^i\{-\frac{1}{2}plog(2\pi )-\frac{1}{2}log|\varSigma _j|-\frac{1}{2}u_j^i(y^i-B_i\mu _j)^T\varSigma _j^{-1}(y^i-B_i\mu _j)\}. \end{aligned}$$
(22)

We calculate the conditional expectation values of the hidden variables in the E-step of the (k+1)th iteration of the EM algorithm as follows

$$\begin{aligned} E(z_j^i|y^i)=z_j^{i(k+1)}=\frac{\pi _j^{i(k)}t_j(y^i|\theta _j^{(k)})}{\sum _{l=1}^K\pi _l^{i(k)}t_l(y^i|\theta _l^{(k)})}, \end{aligned}$$
(23)
$$\begin{aligned} E(u_j^i|y^i)=u_j^{i(k+1)}=\frac{\upsilon _j^{(k)}+p}{\upsilon _j^{(k)}+(y^i-\mu _j^{(k)})^T\varSigma _j^{(k)-1}(y^i-\mu _j^{(k)})}, \end{aligned}$$
(24)

and

$$\begin{aligned} E(logu_j^i|y^i)=logu_j^{i(k)}-log(\frac{\upsilon _j^{(k)}+p}{2})+\psi (\frac{\upsilon _j^{(k)}+p}{2}). \end{aligned}$$
(25)
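A condensed sketch of the E-step, Eqs. (23)-(25) (our own illustration; t_pdf stands for the density of Eq. (2), and the (N, K) array shapes are assumptions):

```python
import numpy as np
from scipy.special import digamma

def e_step(y, pi, mu, Sigma, nu, t_pdf):
    """y: (N, p) data; pi: (N, K) per-pixel priors; mu: (K, p);
    Sigma: (K, p, p); nu: (K,) degrees of freedom; t_pdf: density of Eq. (2)."""
    N, K = pi.shape
    p = y.shape[1]
    z = np.zeros((N, K))
    u = np.zeros((N, K))
    for j in range(K):
        z[:, j] = pi[:, j] * np.array([t_pdf(yi, mu[j], Sigma[j], nu[j]) for yi in y])
        d = y - mu[j]                                   # (N, p) residuals
        maha = np.einsum('ni,ij,nj->n', d, np.linalg.inv(Sigma[j]), d)
        u[:, j] = (nu[j] + p) / (nu[j] + maha)          # Eq. (24)
    z /= z.sum(axis=1, keepdims=True)                   # Eq. (23)
    log_u = np.log(u) - np.log((nu + p) / 2.0) + digamma((nu + p) / 2.0)  # Eq. (25)
    return z, u, log_u
```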

The M-step maximizes the expectation of the complete-data log-likelihood, which can be given as

$$\begin{aligned} Q(\varTheta ;\varTheta ^{(k)})=Q_1(\pi ;\varTheta ^{(k)})+Q_2(\upsilon ;\varTheta ^{(k)})+Q_3(\theta ;\varTheta ^{(k)}), \end{aligned}$$
(26)

where

$$\begin{aligned} Q_1(\pi ;\varTheta ^{(k)})=\sum _{i=1}^N\sum _{j=1}^K z_j^{i(k)}log\pi _j-\sum _{i=1}^N\sum _{m\in N_i}(1+(\sum _{j=1}^K[(\pi _j^i-\pi _j^m)^2+(z_j^i-z_j^m)^2])^{-1})^{-1}, \end{aligned}$$
(27)
$$\begin{aligned} Q_2(\upsilon ;\varTheta ^{(k)})= & {} \sum _{i=1}^N\sum _{j=1}^K z_j^{i(k)}\{-log\varGamma (\frac{\upsilon _j}{2})+\frac{1}{2}\upsilon _jlog(\frac{\upsilon _j}{2})\nonumber \\+ & {} \frac{1}{2}\upsilon _j(logu_j^{i(k)}-u_j^{i(k)})+\psi (\frac{\upsilon _j^{(k)}+p}{2})-log(\frac{\upsilon _j^{(k)}+p}{2})\}, \end{aligned}$$
(28)

and

$$\begin{aligned} Q_3(\theta ;\varTheta ^{(k)})= & {} \sum _{i=1}^N\sum _{j=1}^K z_j^{i(k)}\{-\frac{1}{2}plog(2\pi )-\frac{1}{2}log|\varSigma _j|\nonumber \\+ & {} \frac{1}{2}logu_j^{i(k)}-\frac{1}{2}u_j^{i(k)}(y^i-B_i\mu _j)^T\varSigma _j^{-1}(y^i-B_i\mu _j)\}. \end{aligned}$$
(29)

The expressions for \(\mu _j\) and \(\varSigma _j\) can be obtained by maximizing \(Q_3(\theta ;\varTheta ^{(k)})\) (Eq. 29), which yields

$$\begin{aligned} \frac{\partial Q_3}{\partial \mu _j^{(k+1)}}=\sum _{i=1}^N z_j^{i(k)}u_j^{i(k)}\varSigma _j^{-1}(y^i-B_i^{(k)}\mu _j^{(k+1)}) \end{aligned}$$
(30)

and

$$\begin{aligned} \frac{\partial Q_3}{\partial \varSigma _j^{-(k+1)}}=\frac{1}{2}\sum _{i=1}^N z_j^{i(k)}\{\varSigma _j^{(k+1)}-u_j^{i(k)}(y^i-B_i^{(k)}\mu _j^{(k+1)})(y^i-B_i^{(k)}\mu _j^{(k+1)})^{T}\}. \end{aligned}$$
(31)

Setting \(\frac{\partial Q_3}{\partial \mu _j^{(k+1)}}=0\) and \(\frac{\partial Q_3}{\partial \varSigma _j^{-(k+1)}}=0\) yields

$$\begin{aligned} \mu _j^{(k+1)}=\frac{\sum _{i=1}^N z_j^{i(k)}u_j^{i(k)}y^i}{\sum _{i=1}^N z_j^{i(k)}u_j^{i(k)}B_i^{(k)}} \end{aligned}$$
(32)

and

$$\begin{aligned} \varSigma _j^{(k+1)}=\frac{\sum _{i=1}^N z_j^{i(k)}u_j^{i(k)}(y^i-B_i^{(k)}\mu _j^{(k+1)})(y^i-B_i^{(k)}\mu _j^{(k+1)})^{T}}{\sum _{i=1}^N z_j^{i(k)}}. \end{aligned}$$
(33)
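A sketch of the location and scale updates, Eqs. (32)-(33) (ours; the bias B is treated as fixed from the previous iteration, and its appearance in the denominator of the mean update follows our reading of Eq. (30)):

```python
import numpy as np

def m_step_mu_sigma(y, z, u, B):
    """y: (N, p) data; z, u: (N, K) E-step weights; B: (N,) bias values."""
    N, p = y.shape
    K = z.shape[1]
    mu = np.zeros((K, p))
    Sigma = np.zeros((K, p, p))
    for j in range(K):
        w = z[:, j] * u[:, j]                       # z_j^i u_j^i
        mu[j] = (w @ y) / (w @ B)                   # Eq. (32)
        r = y - B[:, None] * mu[j]                  # residuals y^i - B_i mu_j
        Sigma[j] = (r.T * w) @ r / z[:, j].sum()    # Eq. (33): weighted outer products
    return mu, Sigma
```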

Setting the partial derivative of \(Q_2(\upsilon ;\varTheta ^{(k)})\) (Eq. 28) with respect to \(\upsilon _j\), we have

$$\begin{aligned} \frac{\partial Q_2}{\partial \upsilon _j^{(k+1)}}= & {} \frac{1}{2}\sum _{i=1}^N z_j^{i(k)}\{log(\frac{\upsilon _j}{2})-\psi (\frac{\upsilon _j}{2})+1+logu_j^{i(k)}-u_j^{i(k)}\nonumber \\+ & {} \psi (\frac{\upsilon _j^{(k)}+p}{2})-log(\frac{\upsilon _j^{(k)}+p}{2})\}. \end{aligned}$$
(34)

Setting (34) equal to zero means that \(\upsilon _j\) must be computed iteratively by solving the following equation:

$$\begin{aligned} log(\frac{\upsilon _j}{2})-\psi (\frac{\upsilon _j}{2})+1+\frac{\sum _{i=1}^N z_j^{i(k)}\{logu_j^{i(k)}-u_j^{i(k)}\}}{\sum _{i=1}^N z_j^{i(k)}}+\psi (\frac{\upsilon _j^{(k)}+p}{2} )-log(\frac{\upsilon _j^{(k)}+p}{2})=0. \end{aligned}$$
(35)
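Equation (35) has no closed-form solution; a sketch using bracketed root finding follows (ours; SciPy's brentq and digamma are standard functions, but the bracket [0.1, 200] is an arbitrary choice and the solver assumes a sign change inside it):

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

def update_nu(nu_old, z_j, u_j, log_u_j, p):
    """Solve Eq. (35) for one class j; z_j, u_j, log_u_j are (N,) E-step arrays."""
    c = (z_j * (log_u_j - u_j)).sum() / z_j.sum() \
        + digamma((nu_old + p) / 2.0) - np.log((nu_old + p) / 2.0)
    f = lambda nu: np.log(nu / 2.0) - digamma(nu / 2.0) + 1.0 + c
    return brentq(f, 0.1, 200.0)   # fails if f has no sign change in the bracket
```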

In order to maximize \(Q_1(\pi ;\varTheta ^{(k)})\) (Eq. 27) with respect to \(\pi _j^i\), we set the derivative equal to zero and obtain the following quadratic [15]

$$\begin{aligned} 4[\sum _{m\in N_i}\dot{g}(u_{i,m})](\pi _j^{i(k+1)})^2-4[\sum _{m\in N_i}\dot{g}(u_{i,m})\pi _j^{m}](\pi _j^{i(k+1)})-z_j^{i(k+1)}=0, \end{aligned}$$
(36)

where \(\dot{g}(u_{i,m})\) denotes the derivative of g (Eq. (6)). \(N_i\) in the above equation is the neighborhood of the ith pixel whose label vectors \(\pi ^m\) have not yet been updated in the kth step. The two roots of (36) are

$$\begin{aligned} \pi _j^{i(k+1)}=\frac{[\sum _{m\in N_i}\dot{g}(u_{i,m})\pi _j^m]\pm \sqrt{[\sum _{m\in N_i}\dot{g}(u_{i,m})\pi _j^m]^2+z_j^{i(k+1)}[\sum _{m\in N_i}\dot{g}(u_{i,m})]}}{2[\sum _{m\in N_i}\dot{g}(u_{i,m})]}. \end{aligned}$$
(37)
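A per-pixel sketch of Eq. (37) (ours): note that g of Eq. (6) can be written \(g(u)=u/(1+u)\), so \(\dot{g}(u)=(1+u)^{-2}\). Keeping the positive root and renormalising the resulting \(\pi ^i\) over the K classes afterwards is our assumption.

```python
import numpy as np

def g_prime(u):
    """Derivative of g(u) = u / (1 + u) from Eq. (6): 1 / (1 + u)^2."""
    return 1.0 / (1.0 + u) ** 2

def update_pi(pi_i, z_i, pi_nb, z_nb, z_ij_new, j):
    """Positive root of Eq. (37) for pixel i, class j.
    pi_i, z_i: (K,) current priors/posteriors; pi_nb, z_nb: (M, K) neighbours;
    z_ij_new: scalar posterior z_j^{i(k+1)} from the E-step."""
    u = np.sum((pi_i - pi_nb) ** 2 + (z_i - z_nb) ** 2, axis=1)  # Eq. (7)
    gd = g_prime(u)
    a = gd.sum()
    b = (gd * pi_nb[:, j]).sum()
    return (b + np.sqrt(b ** 2 + z_ij_new * a)) / (2.0 * a)
```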

Many scholars assume that the bias field B is smooth and slowly varying in a small local region [2], and some of them [23, 24] model B as a linear combination of L smooth basis functions \(s_1,...,s_L\):

$$\begin{aligned} B(y)=\sum _{l=1}^L q_l s_l(y), \end{aligned}$$
(38)

where \(q_l\) is the combination coefficient and the basis functions \(s_l\) satisfy:

$$\begin{aligned} \int _\varOmega s_i(x)s_j(x)dx=\delta _{ij}. \end{aligned}$$
(39)

\(\delta _{ij}=1\) for \(i=j\) and \(\delta _{ij}=0\) for \(i\ne j\). Using Eq. (38), the bias field at the ith pixel can be written as:

$$\begin{aligned} B_i=Q^T S_i \end{aligned}$$
(40)
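To make Eqs. (38)-(40) concrete, the sketch below (ours) builds L = 6 two-dimensional polynomial basis functions, orthonormalises them by Gram-Schmidt under a discrete version of the inner product in Eq. (39), and evaluates \(B_i=Q^TS_i\) at every pixel; the polynomial choice, grid size and coefficient values are assumptions:

```python
import numpy as np

H = W = 64
yy, xx = np.mgrid[0:H, 0:W].astype(float)
xx, yy = xx / (W - 1), yy / (H - 1)
raw = [np.ones_like(xx), xx, yy, xx * yy, xx**2, yy**2]   # L = 6 monomials

S = []                       # orthonormal basis: discrete analogue of Eq. (39)
for f in raw:
    v = f.copy()
    for s in S:
        v -= (v * s).sum() * s
    S.append(v / np.sqrt((v * v).sum()))
S = np.stack(S)              # (L, H, W)

Q = np.random.randn(len(S))  # combination coefficients q_l (placeholder values)
B = np.tensordot(Q, S, axes=1)   # Eq. (40): B_i = Q^T S_i at every pixel
```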

Then \(Q_3(\theta ;\varTheta ^{(k)})\) (Eq. 29) can be written as:

$$\begin{aligned} Q_3(\theta ;\varTheta ^{(k)})= & {} \sum _{i=1}^N\sum _{j=1}^K z_j^{i(k)}\{-\frac{1}{2}plog(2\pi )-\frac{1}{2}log|\varSigma _j|\nonumber \\+ & {} \frac{1}{2}logu_j^{i(k)}-\frac{1}{2}u_j^{i(k)}(y^i-Q^T S_i\mu _j)^T\varSigma _j^{-1}(y^i-Q^T S_i\mu _j)\} \end{aligned}$$
(41)

Setting the partial derivative of (41) with respect to Q equal to zero, we have

$$\begin{aligned} \frac{\partial Q_3(\theta ;\varTheta ^{(k)})}{\partial Q}= & {} \partial \{\sum _{i=1}^N\sum _{j=1}^K z_j^{i(k)}\{-\frac{1}{2}plog(2\pi )-\frac{1}{2}log|\varSigma _j|+\frac{1}{2}logu_j^{i(k)}\nonumber \\- & {} \frac{1}{2}u_j^{i(k)}(y^i-Q^T S_i\mu _j)^T\varSigma _j^{-1}(y^i-Q^T S_i\mu _j)\}\}/{\partial Q} \end{aligned}$$
$$\begin{aligned}&\Longrightarrow&\sum _{i=1}^N S_i S_i^T\sum _{j=1}^K z_j^i u_j^i\mu _j^T\varSigma _j^{-1}\mu _j Q=\sum _{i=1}^N S_i\sum _{j=1}^K z_j^i u_j^i (y^i)^{T}\varSigma _j^{-1}\mu _j\nonumber \\&\Longrightarrow&Q=(\sum _{i=1}^N S_i S_i^T\sum _{j=1}^K z_j^i u_j^i\mu _j^T\varSigma _j^{-1}\mu _j)^{-1}\sum _{i=1}^N S_i\sum _{j=1}^K z_j^i u_j^i (y^i)^{T}\varSigma _j^{-1}\mu _j. \end{aligned}$$
(42)
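A sketch of the closed-form coefficient update in Eq. (42) (ours), restricted to scalar pixel intensities (p = 1) so that \(\varSigma _j\) reduces to a variance sj2; the general multivariate case follows the same pattern:

```python
import numpy as np

def update_Q(S, z, u, y, mu, sj2):
    """S: (L, N) basis values at the pixels; z, u: (N, K) E-step weights;
    y: (N,) intensities; mu, sj2: (K,) class means and variances."""
    w1 = (z * u * (mu ** 2 / sj2)).sum(axis=1)       # per-pixel LHS weights
    w2 = (z * u * (mu / sj2)).sum(axis=1) * y        # per-pixel RHS weights
    A = (S * w1) @ S.T                               # sum_i w1_i S_i S_i^T
    b = S @ w2                                       # sum_i w2_i S_i
    return np.linalg.solve(A, b)                     # coefficient vector Q
```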

We summarize the computation process of our algorithm as follows:

Step.1 Initialize the parameters of the model and select a \(3\times 3\) square as the neighborhood window.

Step.2 E-step: Calculate \(z_j^i\), \(u_j^i\) and \(logu_j^i\) by using Eqs. (23)–(25), respectively.

Step.3 M-step: Calculate \(\mu _j\), \(\varSigma _j\), \(\upsilon _j\), \(\pi _j\) and Q by using Eqs. (32)–(33), (35), (37) and (42), respectively.

Step.4 Return to Step.2 until the objective function converges.

5 Experiment Results and Discussion

In this section, we experimentally evaluate our proposed method on a set of MR images. The clinical images are obtained from the Internet Brain Segmentation Repository (IBSR, http://www.cma.mgh.harvard.edu/ibsr/). We also evaluate GMM [6], SVFMM [12], SMM [8] and SMM-SVFMM for comparison. Our experiments were developed in Matlab R2012a and carried out on clinical brain MR images. All methods are initialized by the K-means method with 100 iterations.

Firstly, we compare our model with the other methods in order to show its robustness to noise. Figure 1(a) shows the initial image with Gaussian noise (30% noise level). Figure 1(b) is the ground-truth segmentation. Figure 1(c)-(f) show the segmentation results for GMM, SVFMM, SMM and SMM-SVFMM, respectively. Figure 1(g) shows the corresponding segmentation result of our method. It can be seen from the results that our model is more robust to noise thanks to the new Gibbs prior. In addition, Fig. 1(h)-(m) show the details of Fig. 1(b)-(g), which indicate that our method preserves the details of the initial image better than the other methods. Comparing Fig. 1(l) and Fig. 1(m) shows that the new spatial constraint method presented in this paper is superior to the SVFMM model.

In order to quantitatively analyze the segmentation effect of the various segmentation methods, we use the Js value [25] as a metric to evaluate their performance:

$$\begin{aligned} Js(S_1,S_2)=\frac{|S_1\bigcap S_2|}{|S_1\bigcup S_2|} \end{aligned}$$
(43)

\(S_1\) in (43) is the segmentation result and \(S_2\) is the ground truth. The value of Js is higher when the result is more accurate. Under different noise conditions, the average results are listed in Table 1. The values show that our method achieves more accurate segmentation results than GMM, SVFMM, SMM and SMM-SVFMM under different noise environments.
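A minimal sketch of the Js score of Eq. (43) for label maps (ours; the per-class evaluation on integer label images is an assumption about how the metric is applied):

```python
import numpy as np

def js_score(seg, gt, label):
    """Jaccard score |S1 ∩ S2| / |S1 ∪ S2| for one class label, Eq. (43)."""
    s1 = (seg == label)
    s2 = (gt == label)
    return np.logical_and(s1, s2).sum() / np.logical_or(s1, s2).sum()

seg = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 1]])
print(js_score(seg, gt, 1))   # 2/3
```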

Fig. 1.
figure 1

(a) Image to be segmented with 30% noise level. (b) Original four-class image. (h) Details of (b). (c) GMM segmentation result. (i) Details of (c). (d) SVFMM segmentation result. (j) Details of (d). (e) SMM segmentation result. (k) Details of (e). (f) SMM-SVFMM segmentation result. (l) Details of (f). (g) Segmentation result of the proposed method. (m) Details of (g).

Table 1. Comparison of segmentation accuracy of different models under different noise conditions
Fig. 2.
figure 2

(a) Corrupted by 3% Gaussian noise and 80% bias field. (b) Original four-class image. (c) GMM. (d) SVFMM. (e) SMM. (f) SMM-SVFMM. (g) Proposed method.

Table 2. Comparison of segmentation accuracy of different models under different noise and bias field conditions

Furthermore, in order to show the advantage of our model when the image is affected by intensity inhomogeneity, we compare our method with GMM, SVFMM, SMM and SMM-SVFMM. Figure 2(a) shows the image affected by noise and bias field. Figure 2(b) shows the initial image without noise or gray inhomogeneity. Figure 2(c)-(f) show the segmentation results for GMM, SVFMM, SMM and SMM-SVFMM, which are still affected by noise and intensity inhomogeneity. Figure 2(g) shows the result of our model. The comparison indicates that our model is more robust to noise and intensity inhomogeneity than the other models. Table 2 compares the segmentation accuracy of the different models under different noise and bias field conditions; it shows that our method has the highest accuracy under different degrees of intensity inhomogeneity. Figure 3(a) and (e) are the images to be segmented, corrupted by 3% Gaussian noise and by 30% and 80% bias field, respectively. Figure 3(b) and (f) are the segmentation results of our model, which show that our method is robust at every degree of intensity inhomogeneity.

Fig. 3.
figure 3

(a) Corrupted by 3% Gaussian noise and 30% bias field. (b) Proposed method. (c) Fitted bias field. (d) Corrected image. (e) Corrupted by 3% Gaussian noise and 80% bias field. (f) Proposed method. (g) Fitted bias field. (h) Corrected image.

6 Conclusion

In this paper, we proposed a new model based on SVFMM by using the Student's t-distribution instead of the Gaussian distribution and introducing a new robust Gibbs prior. Furthermore, basis functions are used to fit the bias field, which is coupled to the Markov random field in order to enhance robustness to intensity inhomogeneity. We use the EM algorithm to estimate all parameters simultaneously. From the experimental results, we can conclude that the proposed model is more robust to noise and bias fields than the other methods. When the neighborhood radius is 5, the accuracy of the model is higher than when it is 3.