1 Introduction

Image segmentation refers to partitioning an image into several disjoint subsets such that each corresponds to a meaningful part of the image. It is a classical and fundamental problem in computer vision. Despite many years of research, general-purpose image segmentation is an ill-posed problem and remains very challenging.

The active contour model [4, 6] is one of the most successful segmentation methods. It treats segmentation as a dynamic process in which an initial curve is driven toward the target contour by the internal force of the curve itself and the external force of the image data. Level-set-based active contour models [15, 16] represent the contour implicitly as the zero level set of a level set function and evolve the curve according to an update equation, which finally yields smooth, closed, and high-precision segmentation curves. According to the properties of their energy functions, active contour models are classified into region-based models [7, 12, 18] and edge-based models [3]. Region-based models use local information to guide the contour toward the approximate boundary of the object, while edge-based models use a stopping function to attract the contour to the desired boundaries.

Geodesic active contour (GAC) [3], which was independently introduced by Caselles et al. [2] and Malladi et al. [14], is a very popular edge-based model. Its basic idea is to represent contours as the zero level set and to evolve the level set function according to a partial differential equation (PDE). GAC has several advantages over traditional parametric active contours. First, the contour represented by the level set function can break or merge naturally during the evolution. Second, the level set function always remains a function on a fixed grid, which allows efficient numerical schemes. However, GAC has been shown to reach only a local optimum: the contour may stop inside or outside the object when it encounters a strong boundary during the evolution.

With an interactive segmentation algorithm, the user marks some foreground and background labels to provide prior knowledge before the segmentation is executed. With this prior knowledge, the algorithm can extract objects more accurately. There is a large body of work on interactive segmentation. A regularization term is often used to incorporate the hard constraints of the user’s labels into the cost function, for example a regularization term built from the L2-norm distance [17]. Another popular approach is to build a probability model from the user’s labels, such as a Gaussian mixture model, a support vector machine, or a geodesic model, to assist the segmentation [13, 19]. Graph-based approaches use the user’s labels to encode must-link or cannot-link information as hard or soft constraints [1, 5].

There is some work on interactive level set segmentation that uses a regularization term of hard constraints or a probability model. The method in [20] uses belief propagation to minimize a global cost function according to local level sets. The propagation starts from one user-marked point and iteratively extends the user information from the labeled pixel to its neighborhood by calculating the beliefs of the pixels at the same level as the marked pixel. This method was designed for medical images and does not give good results for objects in heterogeneous images. The method in [10] uses the user’s labels to integrate discriminative classification models and distance transforms with the level set method; the terms of its energy function are based on a probabilistic classifier and an unsigned distance transform of salient edges. This method is effective for heterogeneous images, but it does not let the user give feedback during the interactive process.

In this paper, we propose an interactive level set segmentation method for the edge-based model. We use the product of the hard constraints, which record the user’s labels, and the level set function as a regularization term and add it to the energy function. Each element of the hard constraints represents the label of the corresponding pixel in the image. If the pixel is not marked, the value of the corresponding element is 0; if it is marked as foreground, the value is \(-1\); if it is marked as background, the value is 1. With the new regularization term, the evolution of the level set function is affected only at the locations of the user’s labels, so the evolution can be accurately steered by them. Usually, the initial segmentation is not perfect. Based on the observed result, the user can mark new foreground and background labels for the next evolution. With these new labels, the algorithm efficiently evolves the current level set function without recomputing from the initial contour. A satisfactory result is obtained by repeating this mark-evolution process. The experimental results show the efficiency of our method.

2 Background

Our interactive label regularizer is tested with the GAC model, so we briefly introduce that model here.

2.1 Level Set Segmentation Based on GAC Model

Given an image of size \(m\times n\), we denote the pixel set by \(\varOmega =\{(x,y);x=1,\dots ,m,y=1,\dots ,n\}\) and the image feature by \(I=\{I(x,y);(x,y)\in \varOmega \}\). In the level set method, \(\varOmega \) is treated as a continuous region. The active contour is denoted by C and represented by the zero level set \(C(t)=\{(x,y)|\phi (t,x,y)=0\}\). Here, we define

$$\begin{aligned} \phi (t,x,y)= \left\{ \begin{array}{cl} <0 &{}\text {if}\ (x,y)\in inside(C(t))\\ >0 &{} \text {if}\ (x,y)\in outside(C(t)). \\ \end{array} \right. \end{aligned}$$
(1)

The evolution equation of the level set function \(\phi \) can be written in the following general form:

$$\begin{aligned} \frac{\partial \phi }{\partial t}+F|\nabla \phi |=0 \end{aligned}$$
(2)

The function F is called the speed function. For image segmentation, F depends on the image data and the level set function.

In [8], g is an edge detection function defined by \(g=\frac{1}{1+{|\nabla G_{\delta }*I|^2}}\), where \(G_{\delta }\) is the Gaussian kernel with standard deviation \(\delta \) (this subscript is unrelated to the Dirac function \(\delta (\cdot )\) that appears in the energy terms). Based on g, [8] constructs an energy term, called the external energy, that drives the motion of the zero level set toward the desired image features, such as object boundaries. The external energy of \(\phi (x,y)\) is as follows:

$$\begin{aligned} E_{g,\lambda ,v}(\phi )=\lambda L_{g}(\phi )+vA_g(\phi ) \end{aligned}$$
(3)

where \(\lambda >0\) and v are constants. \(L_{g}(\phi )\) and \(A_g(\phi )\) are defined as follows:

$$L_{g}(\phi )=\int _{\varOmega }g\delta (\phi )|\nabla \phi |dxdy$$
$$A_g(\phi )=\int _{\varOmega }gH(-\phi )dxdy$$

\(L_g\) represents the length of the zero level set weighted by g, and \(A_g\) represents the area inside the contour weighted by g, which controls the speed of the evolution. H is the Heaviside function, defined as follows:

$$\begin{aligned} H(z)= \left\{ \begin{array}{cl} 1 &{}\text {if}\ z\ge 0\\ 0 &{} \text {if}\ z<0. \\ \end{array} \right. \end{aligned}$$
(4)

In practical applications, H is generally replaced by a smoothed version. This paper uses the following substitute:

$$\begin{aligned} H_\varepsilon (z)= \left\{ \begin{array}{cl} 1 &{}\text {if}\ z>\varepsilon \\ 0 &{} \text {if}\ z<-\varepsilon \\ \frac{1}{2}\left( 1+\frac{z}{\varepsilon }+\frac{1}{\pi } \sin \left( \frac{\pi z}{\varepsilon }\right) \right) &{} \text {if}\ |z|\le \varepsilon . \\ \end{array} \right. \end{aligned}$$
(5)

\(\delta (z)=\frac{dH(z)}{dz}\) is called the Dirac measure. Here, \(\delta (z)\) is computed as the derivative of the smoothed function \(H_\varepsilon \).
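The smoothed Heaviside function (5) and its derivative can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names and the default \(\varepsilon =1.5\) are assumptions, and the closed form of the derivative, \(\frac{1}{2\varepsilon }(1+\cos (\pi z/\varepsilon ))\) for \(|z|\le \varepsilon \) and 0 elsewhere, is obtained by differentiating (5).

```python
import numpy as np

def heaviside_eps(z, eps=1.5):
    """Smoothed Heaviside function of Eq. (5). eps=1.5 is an assumed default."""
    z = np.asarray(z, dtype=float)
    inner = 0.5 * (1.0 + z / eps + np.sin(np.pi * z / eps) / np.pi)
    return np.where(z > eps, 1.0, np.where(z < -eps, 0.0, inner))

def dirac_eps(z, eps=1.5):
    """Derivative of heaviside_eps: nonzero only in the band |z| <= eps."""
    z = np.asarray(z, dtype=float)
    inner = (1.0 + np.cos(np.pi * z / eps)) / (2.0 * eps)
    return np.where(np.abs(z) <= eps, inner, 0.0)
```

Because the smoothed Dirac function vanishes outside the band \(|z|\le \varepsilon \), every term multiplied by \(\delta (\phi )\) acts only near the zero level set.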

According to the Euler-Lagrange equation

$$\begin{aligned} \frac{\partial \phi }{\partial t}=-\frac{\partial E}{\partial \phi } \end{aligned}$$
(6)

we obtain the following evolution equation for \(\frac{\partial \phi }{\partial t}\) from the energy (3):

$$\begin{aligned} L(\phi )=\frac{\partial \phi }{\partial t}=\lambda \delta (\phi )div(g\frac{\nabla \phi }{|\nabla \phi |})+vg\delta (\phi ) \end{aligned}$$
(7)

which is the gradient flow that minimizes the energy function. Using (7) and the explicit update with time step \(\tau \)

$$\begin{aligned} \phi ^{k+1}_{i,j}=\phi ^{k}_{i,j}+\tau L(\phi ^{k}_{i,j}) \end{aligned}$$
(8)

we can carry out the evolution of the level set function.
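One explicit iteration of (7) and (8) can be sketched as below. This is only an illustration under stated assumptions: the helper names, the finite-difference scheme based on `np.gradient`, the small constant added to \(|\nabla \phi |\), and the default values of `sigma`, `lam`, `v`, `tau`, and `eps` are all hypothetical choices, not values taken from the paper. The Gaussian standard deviation, written \(\delta \) in the text, is called `sigma` in the code to avoid a clash with the Dirac function.

```python
import numpy as np

def gaussian_smooth(img, sigma):
    """Separable Gaussian blur with edge replication (plain NumPy)."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def edge_indicator(img, sigma=1.5):
    """g = 1 / (1 + |grad(G * I)|^2); close to 0 on strong edges, 1 on flat areas."""
    smoothed = gaussian_smooth(np.asarray(img, dtype=float), sigma)
    gy, gx = np.gradient(smoothed)  # derivatives along rows (y) and columns (x)
    return 1.0 / (1.0 + gx**2 + gy**2)

def dirac_eps(z, eps=1.5):
    """Derivative of the smoothed Heaviside function of Eq. (5)."""
    inner = (1.0 + np.cos(np.pi * z / eps)) / (2.0 * eps)
    return np.where(np.abs(z) <= eps, inner, 0.0)

def gac_step(phi, g, lam=5.0, v=1.5, tau=1.0, eps=1.5):
    """One update phi <- phi + tau * L(phi), with L(phi) as in Eq. (7)."""
    py, px = np.gradient(phi)
    norm = np.sqrt(px**2 + py**2) + 1e-10  # avoid division by zero
    div = np.gradient(g * px / norm, axis=1) + np.gradient(g * py / norm, axis=0)
    d = dirac_eps(phi, eps)
    return phi + tau * (lam * d * div + v * g * d)
```

In this sketch, pixels with \(|\phi |>\varepsilon \) are left unchanged by a step, because both terms of (7) carry the factor \(\delta (\phi )\).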

2.2 The Drawback of GAC Model

It is crucial to keep the evolving level set function close to a signed distance function during the evolution, especially in a neighborhood of the zero level set. To this end, Li et al. [8] proposed the following penalty integral:

$$\begin{aligned} P(\phi )=\int _{\varOmega }\frac{1}{2}(|\nabla \phi |-1)^2 dxdy. \end{aligned}$$
(9)

The energy function of [8] is defined as follows:

$$\begin{aligned} E(\phi )=\mu P(\phi )+E_{g,\lambda ,v}(\phi ). \end{aligned}$$
(10)

The new \(L(\phi )\) is:

$$\begin{aligned} {L(\phi )=} {\mu [\varDelta \phi -div(\frac{\nabla \phi }{|\nabla \phi |})]+\lambda \delta (\phi )div(g\frac{\nabla \phi }{|\nabla \phi |})+vg\delta (\phi )} \end{aligned}$$
(11)
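The first term of (11), \(\varDelta \phi -div(\frac{\nabla \phi }{|\nabla \phi |})\), comes from the penalty \(P(\phi )\) and vanishes wherever \(|\nabla \phi |=1\). The sketch below computes it with simple finite differences; the function names, the 5-point Laplacian with replicated borders, and the small constant in the denominator are illustrative assumptions.

```python
import numpy as np

def laplacian(phi):
    """5-point Laplacian with replicated borders."""
    p = np.pad(phi, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * phi

def distance_reg_term(phi):
    """Laplacian(phi) - div(grad(phi) / |grad(phi)|), the bracket in Eq. (11)."""
    py, px = np.gradient(phi)
    norm = np.sqrt(px**2 + py**2) + 1e-10  # avoid division by zero
    curvature = np.gradient(px / norm, axis=1) + np.gradient(py / norm, axis=0)
    return laplacian(phi) - curvature
```

For a function with unit gradient, such as a plane \(\phi =x-c\), the term is zero away from the image border, so the penalty leaves a signed distance function unchanged.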

We use this energy function to extract the black ring in Fig. 1(a). The evolution of the contour stops at the outer boundary of the black ring, as shown in Fig. 1(b), so the segmentation result contains both the black ring and the gray circle inside it, which is not what we want. This shows that the GAC model tends to reach a locally optimal solution: the active contour stops evolving when the level set meets a boundary, and some objects therefore cannot be extracted. In contrast, as shown in Fig. 1(c) and (d), our proposed method obtains a satisfactory result with the help of additional user labels.

In addition, [9] proposed a level set energy function that adds a penalty term to the traditional level set energy function to force the level set function to stay close to a signed distance function, which completely eliminates the need for the costly re-initialization procedure. [9] added a distance regularization term to the energy function of [8], so as to restrict the evolution of the zero level set to a given range rather than the whole level set function. This method reduces the number of iterations and improves the accuracy of the segmentation. However, it still cannot avoid the local optimum described above.

Fig. 1. Extracting the black ring in the figure using the general GAC model and our method. (a) The initial contour used by the GAC model. (b) The result of the general GAC model, which cannot extract the black ring because of the local optimum. (c) The initial contour with the user’s labels, where red scribbles provide hints for regions to be avoided. (d) The result of our method.

3 The Level Set Model with an Interactive Label Regularization Term

The user’s scribble seeds belong to the sets F and B, which respectively provide hints of sub-regions to be extracted and to be avoided. The input image is to be classified into these two sets. The matrix of the user’s labels, u, is defined as follows:

$$\begin{aligned} u(x,y)=\left\{ \begin{array}{cl} -1 &{} \text {if}\ (x,y)\in F \text {(blue scribbles)}\\ 0 &{} \text {if}\ (x,y) \ \text {is not marked} \\ 1 &{} \text {if}\ (x,y)\in B \text {(red scribbles)} \end{array} \right. \end{aligned}$$
(12)
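The label matrix of (12) can be built from two binary scribble masks, as in the following sketch. The function name and the mask-based input are assumptions made for illustration; the paper does not specify how scribbles are stored.

```python
import numpy as np

def build_label_matrix(fg_mask, bg_mask):
    """Return u as in Eq. (12): -1 on foreground scribbles (F),
    +1 on background scribbles (B), 0 on unmarked pixels."""
    fg = np.asarray(fg_mask, dtype=bool)
    bg = np.asarray(bg_mask, dtype=bool)
    if fg.shape != bg.shape:
        raise ValueError("foreground and background masks must have the same shape")
    if np.any(fg & bg):
        raise ValueError("a pixel cannot be marked as both foreground and background")
    u = np.zeros(fg.shape, dtype=float)
    u[fg] = -1.0
    u[bg] = 1.0
    return u
```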

For the edge-based model, we add a regularization term constructed from the user’s labels to the energy function (10). The new energy function is as follows:

$$\begin{aligned} E(\phi )=\mu P(\phi )+E_{g,\lambda ,v}(\phi )+kE_{u}(\phi ,u) \end{aligned}$$
(13)

where \(\phi \) is the level set function and \(k>0\) is the weight of \(E_u\).

3.1 User’s Label Regularization Term Based on L2-Norm Distance

First, we test the L2-norm distance as the user’s label regularization term, which was also used together with graph-based methods by Shen et al. [17]. They defined a likelihood function \(l(x,y)\) for each pixel \((x,y)\), which represents the likelihood of the pixel belonging to the corresponding set:

$$\begin{aligned} (x,y)\in \left\{ \begin{array}{cl} F &{}\text {if}\ l(x,y)<0\\ B &{} \text {if}\ l(x,y)>0 \\ unknown &{} \text {if}\ l(x,y)=0\\ \end{array} \right. \end{aligned}$$
(14)
$$\begin{aligned} E_u(l,u)=\sum _{x,y}^{}\Vert l(x,y)-u(x,y)\Vert ^2. \end{aligned}$$
(15)

This regularization term is effective for normalized cut and graph cut. However, experimental results show that using (15) to calculate \(E_{u}\) in the level-set-based energy function (13) is not effective. Fig. 2(b) shows that the user’s marks have no useful effect and that many scattered contours appear in the background. The main reason is that \(u(x,y)=0\) means the label of \((x,y)\) is unknown, while \(\phi (x,y)=0\) means \((x,y)\) lies on the current contour. These conflicting meanings give rise to a penalty whenever \(u(x,y)=0\) and \(\phi (x,y)\ne 0\), which disturbs the evolution of the level set function and misleads it toward a different target.
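A small numeric example illustrates this conflict. It assumes that the likelihood l in (15) is replaced directly by \(\phi \), with values of \(\pm 4\) away from the contour, and compares the descent force of the L2 term, \(-2(\phi -u)\), with the force of a product-form term, \(2u\sigma (\phi )(1-\sigma (\phi ))\), where \(\sigma \) is the sigmoid function. The four sample pixels are hypothetical.

```python
import numpy as np

def sigmoid(phi):
    return 1.0 / (1.0 + np.exp(-phi))

# Four sample pixels: foreground mark, unmarked inside, unmarked outside, background mark
phi = np.array([-4.0, -4.0, 4.0, 4.0])
u = np.array([-1.0, 0.0, 0.0, 1.0])

l2_penalty = (phi - u) ** 2        # -> [ 9. 16. 16.  9.]
l2_force = -2.0 * (phi - u)        # -> [ 6.  8. -8. -6.]
product_force = 2.0 * u * sigmoid(phi) * (1.0 - sigmoid(phi))

print(l2_penalty, l2_force, product_force)
```

In this example the L2 term penalizes the unmarked pixels most and pushes their \(\phi \) values toward 0, that is, toward the contour, whereas the product-form force is exactly zero at unmarked pixels.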

Fig. 2. Results of calculating \(E_{u}\) with the L2-norm form and with our method. (a) The initial contours, where blue and red scribbles respectively provide hints for regions to be extracted and to be avoided. (b) The segmentation result using the L2-norm form. (c) The segmentation result using our user’s label regularization term. (Color figure online)

3.2 Proposed Regularization Term

To avoid these scattered contours in the background and reach the intended target, we must remove the side effect of the unmarked part. We therefore propose to use the product of the hard constraints from the user’s labels and the level set function to calculate \(E_{u}(\phi ,u)\). With this product form, the entries of the result matrix that correspond to the unmarked part are zero, so the evolution of the level set function is affected only at the locations of the user’s labels, and the evolution can be accurately steered by those labels.

Since the level set function \(\phi \) and u have very different scales, multiplying them directly is likely to fail. We test this as follows. We use the same initialization of the level set function as in [8], in which the value is −4 inside the object and 4 outside. Thus \(\phi \) has a different scale from u, which leads to the poor segmentation result shown in Fig. 3.

Fig. 3. The segmentation result with different scales. A difference in scale between the level set function and the user’s label matrix leads to a poor result.

To keep \(\phi \) and u at the same scale, we first normalize \(\phi \) with the sigmoid function (17) (shown in Fig. 4). We then shift and rescale it to the range \([-1,1]\) of \(u(x,y)\) by using the expression \(2\sigma (\phi )-1\). In our proposed method, we construct \(E_{u}\) as follows:

$$\begin{aligned} E_{u}(\phi ,u)=\int -u(2\sigma (\phi )-1)dxdy \end{aligned}$$
(16)
$$\begin{aligned} \sigma (\phi )=\frac{1}{e^{-\phi }+1} \end{aligned}$$
(17)
Fig. 4. The sigmoid function.

According to the Euler-Lagrange equation (6), we obtain the new \(L(\phi )\) as follows:

$$\begin{aligned} \begin{aligned} {L(\phi )=}\,&{\mu [\varDelta \phi -div(\frac{\nabla \phi }{|\nabla \phi |})]+\lambda \delta (\phi )div(g\frac{\nabla \phi }{|\nabla \phi |})+vg\delta (\phi )}\\&{+\,2k(u\sigma (\phi )(1-\sigma (\phi )))} \end{aligned} \end{aligned}$$
(18)

The level set function evolves according to (8). As shown in Fig. 2(c), our method produces a satisfactory result for the user.
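The last term of (18) follows from (16) because \(\sigma '(\phi )=\sigma (\phi )(1-\sigma (\phi ))\), so that \(-\frac{\partial }{\partial \phi }[-u(2\sigma (\phi )-1)]=2u\sigma (\phi )(1-\sigma (\phi ))\). The sketch below implements a discrete version of (16) and this force, so the relation can be checked by finite differences. The function names are assumptions; the default \(k=100\) is the value reported as best in Sect. 4.1.

```python
import numpy as np

def sigmoid(phi):
    """Sigmoid function of Eq. (17)."""
    return 1.0 / (1.0 + np.exp(-phi))

def label_energy(phi, u):
    """Discrete version of Eq. (16): sum over pixels of -u * (2*sigmoid(phi) - 1)."""
    return np.sum(-u * (2.0 * sigmoid(phi) - 1.0))

def label_force(phi, u, k=100.0):
    """Label term of Eq. (18): 2 * k * u * sigmoid(phi) * (1 - sigmoid(phi))."""
    s = sigmoid(phi)
    return 2.0 * k * u * s * (1.0 - s)
```

With \(u=-1\) on foreground marks the force is negative, which lowers \(\phi \) and moves the pixel inside the contour; with \(u=1\) it raises \(\phi \); at unmarked pixels it is zero.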

The method is user-friendly and supports cooperation with the user: the evolution can be repeated under the user’s feedback until an ideal result is obtained. The segmentation process with user feedback is illustrated in Fig. 5.

Fig. 5. Real-time feedback process of interactive level set segmentation.
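The mark-evolution loop can be sketched as follows. To stay short, this illustration applies only the label term of (18) and omits the distance regularization and edge terms; the function names, the step size `tau`, the iteration count, and the rule that newer marks override older ones at the same pixel are all assumptions rather than details given in the paper.

```python
import numpy as np

def sigmoid(phi):
    return 1.0 / (1.0 + np.exp(-phi))

def label_step(phi, u, k=100.0, tau=0.1):
    """One explicit step using only the label term of Eq. (18)."""
    s = sigmoid(phi)
    return phi + tau * 2.0 * k * u * s * (1.0 - s)

def interactive_rounds(phi0, label_rounds, iters=50):
    """Evolve phi through successive rounds of user marks.

    phi is carried over between rounds (no re-initialization), and marks
    accumulate; a new mark replaces an older one at the same pixel.
    """
    phi = phi0.copy()
    u = np.zeros_like(phi)
    for new_labels in label_rounds:
        u = np.where(new_labels != 0, new_labels, u)
        for _ in range(iters):
            phi = label_step(phi, u)
    return phi, u
```

Each element of `label_rounds` is a matrix in the format of (12) holding the marks added in that round. In this simplified setting only marked pixels change, which matches the local effect of the label term.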

4 Experiment Results

Our experiments cover two aspects. First, we test the effect of the weight of our regularization term. Then, we present experiments on interactive segmentation.

4.1 The Weight of the Regularization Term

In the energy function (13), k indicates the extent to which the user’s marks influence the segmentation process. We fix the weights of the other terms as in [8] and run experiments with different values of k. Figure 6 shows the results of segmenting images with the same user marks and different values of k. At first, the segmentation results improve significantly as k increases; once k exceeds a certain range, the improvement clearly diminishes, and the results can even get worse. Our experiments show that \(k=100\) achieves the best performance.

Fig. 6. Results for different values of k. For the first and second rows, \(k=100\) gives the best performance. For the third and fourth rows, the results show no significant change for \(100\le k\le 1000\), and the quality clearly decreases when \(k=5000\).

4.2 Interactive Segmentation Process

Figure 8 shows the interactive segmentation process for objects in five images. The user first draws an initial contour in the image and then makes some marks for the foreground and background. After the first evolution, the user can mark the pixels that were segmented incorrectly, and the algorithm continues evolving from the previous segmentation result. By repeating this feedback process several times, the user obtains a satisfactory result. Naturally, if some marks are given as seeds before the first segmentation, the number of interactions decreases.

Fig. 7. Behavioral predictability. In the first row, only the part marked as background changes from foreground to background after the segmentation process. The second row shows that the segmentation result changes only at the marked locations.

Fig. 8. Interactive segmentation process. We show five interactive segmentation processes; in each, the user obtains a satisfactory result through the feedback process. The object in the second process is a bird that is very similar to the background and has no clear boundary; we can extract the bird after several rounds of the mark-evolution process.

4.3 Behavioral Predictability

[11] proposed several evaluation criteria for interactive segmentation algorithms and described the corresponding evaluation process. During the evaluation, many users were invited to perform interactive segmentation. According to their feedback, users are very sensitive to the operating experience. They liked it when small localized marks had only a local effect. Conversely, they disliked algorithms in which small additions to the markup could cause large changes in the segmentation. Figure 7 shows the behavioral predictability of our method.

5 Conclusion

In this paper, we construct a novel regularization term from the user’s marks. Our method solves the problem that some specified objects in an image cannot be extracted because of the local optimum. With the proposed method, the user can control the segmentation process accurately. Our method also supports repeating the mark-evolution process until the user obtains a satisfactory result. In future work, we will consider how to apply the new regularization term to region-based models.