Abstract
A convex non-convex variational model is proposed for multiphase image segmentation. We consider a specially designed non-convex regularization term which adapts spatially to the image structures, allowing better control of the segmentation boundary and easier handling of intensity inhomogeneities. The nonlinear optimization problem is efficiently solved by an alternating direction method of multipliers (ADMM) procedure. We provide a convergence analysis and perform numerical experiments on several images, showing the effectiveness of this procedure.
Notes
A convex function is proper if it nowhere takes the value \(-\infty \) and is not identically equal to \(+\infty \).
References
Bioucas-Dias, J., Figueiredo, M.: Fast image recovery using variable splitting and constrained optimization. IEEE Trans. Image Process. 19(9), 2345–2356 (2010)
Blake, A., Zisserman, A.: Visual Reconstruction. MIT Press, Cambridge (1987)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Brown, E.S., Chan, T.F., Bresson, X.: Completely convex formulation of Chan–Vese image segmentation model. Int. J. Comput. Vis. 98(1), 103–121 (2012)
Cai, X.H., Chan, R.H., Zeng, T.Y.: A two-stage image segmentation method using a convex variant of the Mumford–Shah model and thresholding. SIAM J. Imaging Sci. 6(1), 368–390 (2013)
Chan, T., Esedoglu, S., Nikolova, M.: Algorithms for finding global minimizers of image segmentation and denoising models. SIAM J. Appl. Math. 66(5), 1632–1648 (2006)
Chan, T., Vese, L.A.: Active contours without edges. IEEE Trans. Image Process. 10, 266–277 (2001)
Chan, T., Vese, L.A.: Active contours without edges for vector-valued images. J. Vis. Commun. Image Represent. 11, 130–141 (2000)
Chen, P.Y., Selesnick, I.W.: Group-sparse signal denoising: non-convex regularization, convex optimization. IEEE Trans. Signal Proc. 62, 3464–3478 (2014)
Christiansen, M., Hanke, M.: Deblurring methods using antireflective boundary conditions. SIAM J. Sci. Comput. 30, 855–872 (2008)
Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
Donatelli, M., Reichel, L.: Square smoothing regularization matrices with accurate boundary conditions. J. Comput. Appl. Math. 272, 334–349 (2014)
Dong, B., Chien, A., Shen, Z.: Frame based segmentation for medical images. Commun. Math. Sci. 32, 1724–1739 (2010)
Ekeland, I., Temam, R.: Convex Analysis and Variational Problems (Classics in Applied Mathematics). SIAM, Philadelphia (1999)
Esedoglu, S., Tsai, Y.: Threshold dynamics for the piecewise constant Mumford–Shah functional. J. Comput. Phys. 211, 367–384 (2006)
Huang, G., Lanza, A., Morigi, S., Reichel, L., Sgallari, F.: Majorization–minimization generalized Krylov subspace methods for \(\ell _p - \ell _q\) optimization applied to image restoration. BIT Numer. Math. 57(2), 351–378 (2017). doi:10.1007/s10543-016-0643-8
Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization. In: Aujol, JF., Nikolova, M., Papadakis, N. (eds.) Scale Space and Variational Methods in Computer Vision. SSVM 2015. Lecture Notes in Computer Science, vol. 9087, pp. 666–677. Springer, Cham (2015)
Lanza, A., Morigi, S., Reichel, L., Sgallari, F.: A generalized Krylov subspace method for lp–lq minimization. SIAM J. Sci. Comput. 37(5), S30–S50 (2015)
Lanza, A., Morigi, S., Sgallari, F.: Constrained TVp-l2 model for image restoration. J. Sci. Comput. 68(1), 64–91 (2016)
Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization with parameter selection. J. Math. Imaging Vis. 56(2), 195–220 (2016)
Lanza, A., Morigi, S., Selesnick, I., Sgallari, F.: Nonconvex nonsmooth optimization via convex–nonconvex majorization–minimization. Numer. Math. 136(2), 343–381 (2017)
Li, F., Ng, M., Zeng, T.Y., Shen, C.: A multiphase image segmentation method based on fuzzy region competition. SIAM J. Imaging Sci. 3, 277–299 (2010)
Li, F., Shen, C., Li, C.: Multiphase soft segmentation with total variation and H1 regularization. J. Math. Imaging Vis. 37, 98–111 (2010)
Lie, J., Lysaker, M., Tai, X.: A binary level set model and some applications to Mumford–Shah image segmentation. IEEE Trans. Image Process. 15, 1171–1181 (2006)
Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42(5), 577–685 (1989)
Ng, M.K., Chan, R.H., Tang, W.C.: A fast algorithm for deblurring models with Neumann boundary conditions. SIAM J. Sci. Comput. 21, 851–866 (1999)
Nikolova, M.: Estimation of binary images by minimizing convex criteria. Proc. IEEE Int. Conf. Image Process. 2, 108–112 (1998)
Parekh, A., Selesnick, I.W.: Convex denoising using non-convex tight frame regularization. arXiv:1504.00976 (2015)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Berlin (1998)
Sandberg, B., Kang, S., Chan, T.: Unsupervised multiphase segmentation: a phase balancing model. IEEE Trans. Image Process. 19, 119–130 (2010)
Selesnick, I.W., Bayram, I.: Sparse signal estimation by maximally sparse convex optimization. IEEE Trans. Signal Process. 62(5), 1078–1092 (2014)
Selesnick, I.W., Parekh, A., Bayram, I.: Convex 1-D total variation denoising with non-convex regularization. IEEE Signal Process. Lett. 22(2), 141–144 (2015)
Strong, D.M., Chan, T.F.: Edge-preserving and scale-dependent properties of total variation regularization. Inverse Probl. 19(6), 165–187 (2003)
Wu, C., Tai, X.C.: Augmented Lagrangian method, dual methods, and split Bregman iteration for ROF, vectorial TV, and high order models. SIAM J. Imaging Sci. 3(3), 300–339 (2010)
Wu, C., Zhang, J., Tai, X.C.: Augmented Lagrangian method for total variation restoration with non-quadratic fidelity. Inverse Probl. Imaging 5(1), 237–261 (2011)
Yuan, J., Bae, E., Tai, X., Boykov, Y.: A study on continuous max-flow and min-cut approaches. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2217–2224 (2010)
Yuan, J., Bae, E., Tai, X., Boykov, Y.: A continuous max-flow approach to Potts model. In: ECCV 2010: Proceedings of the 11th European Conference on Computer Vision, Springer, Berlin, pp. 332–345 (2010)
Varga, R.S.: Matrix Iterative Analysis, Springer Series in Computational Mathematics. Springer, Berlin, Heidelberg (2000). doi:10.1007/978-3-642-05156-2
Acknowledgements
We would like to thank the referees for comments that led to improvements of the presentation. This work is partially supported by HKRGC GRF Grant No. CUHK300614, CUHK14306316, CRF Grant No. CUHK2/CRF/11G, AoE Grant AoE/M-05/12, CUHK DAG No. 4053007, and FIS Grant No. 1907303. Research by SM, AL and FS was supported by the “National Group for Scientific Computation (GNCS-INDAM)” and by the ex60% project “Funds for selected research topics” of the University of Bologna.
Appendix
Proof of Lemma 3.2
Let \(x :{=} (x_1,x_2,x_3)^T \in {\mathbb R}^3\). Then, the function \(f(\,\cdot \,;\lambda ,T,a)\) in (3.3) can be rewritten in a more compact form as follows:
with the matrix \(Q \in {\mathbb R}^{3 \times 3}\) defined as
We introduce the eigenvalue decomposition of the matrix Q in (7.2):
where orthogonality of the modal matrix V in (7.3) follows from the symmetry of the matrix Q. Then, we decompose the diagonal eigenvalue matrix \(\Lambda \) in (7.3) as follows:
Substituting (7.4) into (7.3), then (7.3) into (7.1), we obtain the following equivalent expression for the function f:
Recalling that convexity of a function is invariant under non-singular linear transformations of its domain, we introduce the following linear transformation of the domain \({\mathbb R}^3\) of the function f above:
which is non-singular due to V and Z being non-singular matrices. Defining \(f_T :{=} f \circ T\) as the representation of the function f in the transformed domain, we have:
Recalling the definitions of Z and \(\widetilde{\Lambda }\) in (7.4), we can write (7.7) in the explicit form:
where the function g in (7.8) is defined in (3.6). Since the first term in (7.8) is quadratic and convex, a sufficient condition for the function \(f_T\) in (7.8) to be strictly convex is that the function g in (3.6) is strictly convex. This concludes the proof after recalling that the function f is strictly convex if and only if the function \(f_T\) is strictly convex. \(\square \)
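For completeness, the invariance property invoked in the last step is the following standard fact, stated here in generic terms: for a non-singular linear map T and \(f_T = f \circ T\),
\[
f_T\big (\theta y_1 + (1-\theta )\, y_2\big ) \;=\; f\big (\theta \, T y_1 + (1-\theta )\, T y_2\big ), \qquad \theta \in (0,1),
\]
so \(f_T\) satisfies the strict convexity inequality at a pair \(y_1 \ne y_2\) if and only if f satisfies it at the pair \(T y_1 \ne T y_2\), the two pairs being in one-to-one correspondence since T is invertible.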
Proof of Lemma 3.3
It follows immediately from the definition of strict convexity that a function from \({\mathbb R}^2\) into \({\mathbb R}\) is strictly convex if and only if the restriction of the function to any possible straight line of \({\mathbb R}^2\) is strictly convex. Due to the radial symmetry property of function \(\psi \) in (3.7), the restriction of \(\psi \) to a generic straight line l is identical to the restriction of \(\psi \) to any other straight line obtained by rotating l around the origin. Hence, \(\psi \) is strictly convex if and only if all its restrictions to horizontal straight lines (any other direction, e.g. vertical, could be chosen as well) with non-negative intercept are strictly convex.
We denote by \(h_0\) and \(h_k\) the functions from \({\mathbb R}\) into \({\mathbb R}\) corresponding to the restriction of \(\psi \) to the horizontal straight line with null intercept, namely the horizontal coordinate axis, and to any horizontal straight line with positive intercept \(k > 0\), respectively. From the definition of the function \(\psi \) in (3.7), we have:
Since the function \(\psi \) in (3.7) is strictly convex if and only if both \(h_0\) in (7.9) and \(h_k\) in (7.10) are strictly convex, it is clear that a necessary condition for \(\psi \) to be strictly convex is that \(h_0\) in (7.9) is strictly convex. It thus remains to demonstrate that \(h_0\) being strictly convex is also a sufficient condition for \(\psi \) to be strictly convex or, equivalently, that strict convexity of \(h_0\) in (7.9) implies strict convexity of \(h_k\) in (7.10) for any positive k.
The functions \(h_0\) and \(h_k\) in (7.9)–(7.10) are clearly even and, since we are assuming \(z \in \mathcal {C}^1({\mathbb R}_+)\), we have that \(h_k \in \mathcal {C}^1({\mathbb R})\) and \(h_0 \in \mathcal {C}^0({\mathbb R}) \cap \mathcal {C}^1({\mathbb R}\setminus \{0\})\). In particular, the first-order derivatives of \(h_0\) and \(h_k\) are as follows:
We note that \(h_0\) is continuously differentiable also at the point \(t=0\) if and only if the right-sided derivative of the function z at 0 is equal to 0.
We now assume that the function \(h_0\) in (7.9) is strictly convex. This implies that the first-order derivative function \(h_0'\) is monotonically increasing on its entire domain \({\mathbb R}\setminus \{0\}\). It thus follows from the definition of \(h_0'\) in (7.11) that the first-order derivative function \(z'\) is nonnegative and monotonically increasing on \({\mathbb R}_+\). We then notice that, for any given \(k > 0\), the first-order derivative function \(h_k'\) in (7.12) is continuous (since \(z'\) is continuous on \({\mathbb R}_+\) by assumption) and odd (hence \(h_k'(0) = 0\)). Finally, by recalling that compositions and products of positive, monotonically increasing functions are monotonically increasing, it follows that \(h_k'\) in (7.12) is monotonically increasing on the entire real line, hence \(h_k\) in (7.10) is strictly convex. This completes the proof. \(\square \)
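The restriction argument above is easy to check numerically. The following sketch is purely illustrative: the penalty z used here is a hypothetical \(\mathcal {C}^1\) profile with \(z'\) nonnegative and increasing (so that \(h_0\) is strictly convex), not the function z of (3.7), and the grid-based test of second differences is no substitute for the proof.

```python
import numpy as np

# Hypothetical C^1 penalty z on R_+ with z' >= 0 and increasing,
# standing in for the z of (3.7); NOT the penalty used in the paper.
def z(s):
    return s**2 + s**3

def h(t, k):
    """Restriction of psi(x) = z(||x||_2) to the horizontal line x_2 = k."""
    return z(np.sqrt(t**2 + k**2))

t = np.linspace(-3.0, 3.0, 2001)
for k in [0.0, 0.1, 1.0, 5.0]:
    d2 = np.diff(h(t, k), 2)          # discrete second differences
    print(f"k = {k:4.1f}:  min second difference = {d2.min():.3e}")
```

All printed minima are strictly positive, in agreement with the lemma: strict convexity of the restriction \(h_0\) to the horizontal axis propagates to every horizontal line.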
Proof of Proposition 3.7
The functional \(\mathcal {J}(\,\cdot \,;\lambda ,\eta ,a)\) in (1.1) is clearly proper. Moreover, since the functions \(\phi (\,\cdot \,;T,a)\) and \(\Vert \cdot \Vert _2\) are both continuous and bounded from below by zero, \(\mathcal {J}\) is also continuous and bounded from below by zero. In particular, we notice that \(\mathcal {J}\) achieves the zero value only for \(u = b\) with b a constant image. The penalty function \(\phi (\,\cdot \,;T,a)\) is not coercive, hence the regularization term in \(\mathcal {J}\) is not coercive. However, since the fidelity term is quadratic and strictly convex, hence coercive, and the regularization term is bounded from below by zero, \(\mathcal {J}\) is coercive.
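In formulas, writing the quadratic fidelity term as \(\frac{\lambda }{2}\,\Vert u-b\Vert _2^2\) (the form suggested by the optimality conditions used below, e.g. (7.38)) and using non-negativity of the regularization term, coercivity follows from
\[
\mathcal {J}(u) \;\ge \; \frac{\lambda }{2}\,\Vert u-b\Vert _2^2 \;\longrightarrow \; +\infty
\qquad \text{as} \quad \Vert u\Vert _2 \rightarrow \infty .
\]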
As far as strong convexity is concerned, it follows from Definition 3.6 that the functional \(\mathcal {J}(\,\cdot \,;\lambda ,T,a)\) in (1.1) is \(\mu \)-strongly convex if and only if the functional \(\widetilde{\mathcal {J}}(u;\lambda ,T,a,\mu )\) defined as
is convex, where \(\mathcal {A}(u)\) is an affine function of u. We notice that the functional \(\widetilde{\mathcal {J}}\) in (7.13) almost coincides with the original functional \(\mathcal {J}\) in (1.1), the only difference being that the coefficient \(\lambda \) is replaced by \(\lambda -\mu \). Hence, we can apply the results in Theorem 3.5 and state that \(\widetilde{\mathcal {J}}\) in (7.13) is convex if condition (3.10) is satisfied with \(\lambda - \mu \) in place of \(\lambda \). By substituting \(\lambda - \mu \) for \(\lambda \) in condition (3.10), deriving the solution interval for \(\mu \) and then taking the maximum, one obtains equality (3.22). \(\square \)
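To make the link between (1.1) and (7.13) explicit, and again assuming the quadratic fidelity term \(\frac{\lambda }{2}\Vert u-b\Vert _2^2\), subtracting the strong convexity modulus gives
\[
\frac{\lambda }{2}\,\Vert u-b\Vert _2^2 \;-\; \frac{\mu }{2}\,\Vert u\Vert _2^2
\;=\; \frac{\lambda -\mu }{2}\,\Vert u-b\Vert _2^2 \;-\; \mu \,\langle u , b \rangle \;+\; \frac{\mu }{2}\,\Vert b\Vert _2^2 ,
\]
so that \(\mathcal {J}(u) - \frac{\mu }{2}\Vert u\Vert _2^2\) coincides with \(\mathcal {J}(u)\) computed with coefficient \(\lambda -\mu \), up to the affine term \(\mathcal {A}(u) = -\,\mu \langle u,b\rangle + \frac{\mu }{2}\Vert b\Vert _2^2\).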
Proof of Proposition 4.1
The demonstration of condition (4.17) for strict convexity of the function \(\theta \) in (4.16) is straightforward. In fact, the function \(\theta \) can be equivalently rewritten as
with \(\mathcal {A}(x)\) an affine function, so that a necessary and sufficient condition for \(\theta \) to be strictly convex is that the function \(\bar{\theta }\) in (7.14) is strictly convex. We then notice that \(\bar{\theta }\) is almost identical to the function g in (3.6), the only difference being that the coefficient \(\beta /2\) in \(\bar{\theta }\) appears as \(\lambda /18\) in g. By setting \(\lambda /18 = \beta /2 \Longleftrightarrow \lambda = 9 \beta \), the two functions coincide. The condition (3.10) for strict convexity of g reads \(\lambda > 9\,a\); hence, substituting \(\lambda = 9 \beta \) into it, we obtain condition (4.17) for strict convexity of \(\theta \).
We remark that condition \(\beta > a\) reduces to \(\beta \ge a\) when only convexity is required.
For the proof of statement (4.19), according to which the unique solution \(x^*\) of the strictly convex problem (4.18) is obtained by a shrinkage of vector r, we refer the reader to [20, Proposition 4.5].
We now prove statement (4.20). First, we notice that if \(\Vert r\Vert _2 = 0\), i.e. r is the null vector, the minimization problem in (4.18) with the objective function \(\theta (x)\) defined in (4.16) reduces to
Since the first and second terms of the cost function in (7.15) are, respectively, a monotonically non-decreasing and a monotonically increasing function of \(\Vert x\Vert _2\), the solution of (7.15) is clearly \(x^* = 0\). Hence, the case \(\Vert r\Vert _2 = 0\) can be easily dealt with by taking any value \(\xi ^*\) in formula (4.19). We included the case \(\Vert r\Vert _2 = 0\) in formula a) of (4.20). In the following, we consider the case \(\Vert r\Vert _2 > 0\).
Based on the previously demonstrated statement (4.19), by setting \(x = \xi \, r\), \(\xi \ge 0\), we turn the original unconstrained 2-dimensional problem in (4.18) into the following equivalent constrained 1-dimensional problem:
where in (7.16) we omitted the constants and introduced the cost function \(f: {\mathbb R}_+ \rightarrow {\mathbb R}\) for future reference. Since the function \(\phi \) in (7.16), which is defined in (2.1), is continuously differentiable on \({\mathbb R}_+\), the cost function f in (7.16) is also continuously differentiable on \({\mathbb R}_+\). Moreover, f is strictly convex since it represents the restriction of the strictly convex function \(\theta \) in (4.16) to the half-line \(\xi \, r, \, \xi \ge 0\). Hence, the first-order derivative \(f'(\xi )\) is a continuous, monotonically increasing function and a necessary and sufficient condition for an inner point \(0< \xi < 1\) to be the global minimizer of f is that \(f'(\xi ) = 0\). From the definition of f in (7.16) we have:
and, in particular:
It follows from (7.18) that the solution of (7.16) cannot be \(\xi ^* = 0\), hence it is either \(\xi ^* = 1\) or an inner stationary point.
Recalling the definition of \(\phi (\,\cdot \,;T,a)\) in (2.1), after some simple manipulations the function \(f'(\xi )\) in (7.17) can be rewritten in the following explicit form:
that is:
Denoting by \(\xi _1^*\), \(\xi _2^*\), \(\xi _3^*\) the points where \(f_1'\), \(f_2'\), \(f_3'\) in (7.20) equal zero, respectively, we have:
However, for \(\xi _1^*\), \(\xi _2^*\) and \(\xi _3^*\) in (7.21) to be acceptable candidate solutions of problem (7.16), they must belong to the domains \(\mathcal {D}_1\), \(\mathcal {D}_2\), \(\mathcal {D}_3\) of \(f_1'\), \(f_2'\), \(f_3'\), respectively, and obviously also to the optimization domain \(\mathcal {O} :{=} [0,1]\) of problem (7.16). We have:
The proof of statement (4.20) is thus completed. \(\square \)
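The 1-dimensional reduction used in the proof also lends itself to a direct numerical check. The sketch below is purely illustrative and relies on explicit assumptions: the penalty phi is a hypothetical smooth, increasing, concave profile (not the \(\phi (\,\cdot \,;T,a)\) of (2.1)), the objective \(\frac{\beta }{2}\Vert x-r\Vert _2^2 + \phi (\Vert x\Vert _2)\) is only our reading of the subproblem (4.18), and the closed-form shrinkage (4.19)–(4.20) is replaced by a brute-force search over \(\xi \in [0,1]\).

```python
import numpy as np

# Hypothetical C^1 penalty: smooth, increasing and concave on R_+,
# standing in for phi(.; T, a) of (2.1); NOT the paper's penalty.
def phi(s, T=1.0):
    return T * (1.0 - np.exp(-s / T))

def shrink(r, beta=5.0, T=1.0, n_grid=100001):
    """Solve min_x  beta/2 ||x - r||_2^2 + phi(||x||_2)  via the 1-D
    reduction x = xi * r, xi in [0, 1] (dense grid search, illustration only)."""
    nr = np.linalg.norm(r)
    if nr == 0.0:                      # case a) of (4.20): null vector r
        return np.zeros_like(r)
    xi = np.linspace(0.0, 1.0, n_grid)
    f = 0.5 * beta * (xi - 1.0)**2 * nr**2 + phi(xi * nr, T)
    return xi[np.argmin(f)] * r

r = np.array([0.3, -0.4])
print("r  =", r)
print("x* =", shrink(r), " (a shrinkage of r, as predicted by (4.19))")
```

The objective restricted to the half-line \(x = \xi \, r\) is \(\frac{\beta }{2}(1-\xi )^2\Vert r\Vert _2^2 + \phi (\xi \Vert r\Vert _2)\), which, up to additive constants, plays the role of the cost function f minimized over \([0,1]\) in (7.16).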
Proof of Theorem 5.7
Based on the definition of the augmented Lagrangian functional in (5.2), we rewrite in explicit form the first inequality of the saddle-point condition in (4.7):

and, similarly, the second inequality:
In the first part of the proof, we prove that if \(\,(u^*,t^*;\rho ^*)\,\) is a solution of the saddle-point problem (4.6)–(4.7), that is, it satisfies the two inequalities (7.25) and (7.26), then \(u^*\) is a global minimizer of the functional \(\mathcal {J}\) in (1.1).
Since (7.25) must be satisfied for any \(\rho \;{\in }\; {\mathbb R}^{2n}\), we have:
The second inequality (7.26) must be satisfied for any \((u,t) \;{\in }\; {\mathbb R}^n {\times }\, {\mathbb R}^{2n}\). Hence, by taking \(t = Du\) in (7.26) and, at the same time, substituting in (7.26) the previously derived condition (7.27), we obtain:
Inequality (7.28) indicates that \(u^*\) is a global minimizer of the functional \(\mathcal {J}\) in (1.1). Hence, we have proved that all the saddle-point solutions of problem (4.6)–(4.7), if any exist, are of the form \(\,(u^*,Du^*;\rho ^*)\,\), with \(u^*\) denoting a global minimizer of \(\mathcal {J}\).
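For the reader's convenience, the step from (7.26) to (7.28) uses the identity
\[
\mathcal {L}(u, Du; \rho ) \;=\; \mathcal {J}(u;\lambda ,T,a) \qquad \text{for all } u \text{ and } \rho ,
\]
which holds provided the augmented Lagrangian (5.2) has the usual split form, with the regularization written in the auxiliary variable t and the multiplier and penalty terms involving only \(t - Du\): both of the latter vanish at \(t = Du\).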
In the second part of the proof, we prove that at least one solution of the saddle-point problem exists. In particular, we prove that if \(u^*\) is a global minimizer of \(\mathcal {J}\) in (1.1), then there exists at least one pair \(\,(t^*,\,\rho ^*) \,{\in }\; {\mathbb R}^{2n} {\times }\, {\mathbb R}^{2n}\) such that \((u^*,t^*;\rho ^*)\) is a solution of the saddle-point problem (4.6)–(4.7), that is, it satisfies the two inequalities (7.25) and (7.26). The proof relies on a suitable choice of the vectors \(t^*\) and \(\rho ^*\). We take:
where the term \(\bar{\partial }_{t} \left[ \, R \,\right] (Du^*)\) indicates the Clarke generalized gradient (with respect to t, calculated at \(Du^*\)) of the nonconvex regularization function R defined in (5.1). We notice that a vector \(\rho ^*\) satisfying (7.30) is guaranteed to exist thanks to Proposition 5.2. In fact, since here we are assuming that \(u^*\) is a global minimizer of functional \(\mathcal {J}\), the first-order optimality condition in (5.5) holds true.
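Although (7.29)–(7.30) are not reproduced here, the way they are used below — see (7.36)–(7.38) and the discussion of (7.42) — suggests the following reading, which we record only as a plausible reconstruction:
\[
t^* \;=\; D u^* , \qquad
\rho ^* \;\in \; \bar{\partial }_{t} \left[ \, R \,\right] (Du^*)
\quad \text{with} \quad
D^T \rho ^* \;=\; -\,\lambda \,(u^*-b) ,
\]
the existence of such a vector \(\rho ^*\) being exactly what the first-order optimality condition (5.5) provides.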
Due to (7.29), the first saddle-point condition in (7.25) is clearly satisfied. Proving the second condition (7.26) is less straightforward: we need to investigate the optimality conditions of the functional \(\mathcal {L}\,(u,t;\rho ^*)\) with respect to the pair of primal variables (u, t). We follow the same procedure used, e.g., in [35], which requires \(\mathcal {L}\,(u,t;\rho ^*)\) to be jointly convex in (u, t). According to Proposition 5.4, in our case this requirement is fulfilled if the penalty parameter \(\beta \) satisfies condition (5.17), which has thus been taken as a hypothesis of this theorem. Hence, we can apply Lemma 5.6 and state that (7.26) is satisfied if and only if both of the following optimality conditions are met:
where in (7.31)–(7.32) we introduced the two functions \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) representing the restrictions of functions \(\mathcal {L}\,(u,t^*;\rho ^*)\) and \(\mathcal {L}\,(u^*,t;\rho ^*)\) to only the terms depending on the optimization variables u and t, respectively. In particular, after recalling the definition of the augmented Lagrangian functional in (5.2), we have
where, like in [35], \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) have been split into the sum of two functions with the aim of then deriving optimality conditions for \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) by means of Lemma 5.5. Unlike in [35], the ADMM quadratic penalty term \(\frac{\beta }{2} \, \Vert t - D u \Vert _2^2\) has been split into two parts (differently in \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\)) in order to deal with the nonconvex regularization term. In particular, the coefficients \(\beta _1\), \(\beta _2\) introduced in (7.33)–(7.34) satisfy
such that the terms \(S^{(u)}\), \(S^{(t)}\) in (7.33)–(7.34) are clearly convex and the terms \(Q^{(u)}\), \(Q^{(t)}\) are convex due to results in Lemma 5.3 and Proposition 4.1, respectively. We also notice that all the functions \(Q^{(u)}\), \(Q^{(t)}, S^{(u)}\), \(S^{(t)}\) are proper and continuous and that \(S^{(u)}\), \(S^{(t)}\) are Gâteaux-differentiable. Hence, we can apply Lemma 5.5 separately to (7.33) and (7.34), to check if the pair \((u^*,t^*)\) satisfies the optimality conditions in (7.31) and (7.32), so that the second saddle-point condition (7.26) holds true. We obtain:
where the term \(t^*-Du^*\) in (7.36)–(7.37) is zero due to the setting (7.29). We rewrite conditions (7.36)–(7.37) as follows:
where in (7.38) we added and subtracted the term \(\lambda \, (u^*-b)\) and added the null term \(\beta _1 D^T (t^* - D u^*)\), and in (7.39) we added the null term \(\beta _2 \, (t^* - Du^*)\). The term \(-\lambda \, (u^*-b) - D^T \rho ^*\) in (7.38) is null due to the setting (7.30). By introducing the two functions
which are convex under conditions (7.35) for the same reason for which the functions \(Q^{(u)}\), \(Q^{(t)}\) in (7.33)–(7.34) are convex, conditions (7.38)–(7.39) can be rewritten as
where we highlighted that the left side of the scalar product in (7.41) represents the subdifferential (actually, the standard gradient) of function U calculated at \(u^*\) and that the left side of the scalar product in (7.42) is a particular vector belonging to the subdifferential of function T calculated at \(t^*\). This second statement comes from the definition of function T in (7.40) and from settings (7.29)–(7.30).
Optimality conditions in (7.41)–(7.42) are easily proved by noticing that the left-hand sides of (7.41)–(7.42) represent the Bregman distances associated with functions U and T, respectively, which are known to be non-negative for convex functions. Hence, the second saddle-point condition in (7.26) is satisfied and, finally, the second and last part of the proof is completed. \(\square \)
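For the reader's convenience, the non-negativity invoked in this last step is the standard property of the Bregman distance of a convex function F at a subgradient \(g \in \partial F(y)\):
\[
\mathcal {D}_{F}(x,y) \;:=\; F(x) \;-\; F(y) \;-\; \langle \, g , x - y \, \rangle \;\ge \; 0 \qquad \text{for all } x ,
\]
which is nothing but the subgradient inequality for F; this is precisely the non-negativity used to establish (7.41)–(7.42) for the convex functions U and T.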
Proof of Theorem 5.8
Let us define the following errors:
Since \((u^*,t^*;\rho ^*)\) is a saddle-point of the augmented Lagrangian functional in (4.6), it follows from Theorem 5.7 that \(t^* = Du^*\). This relationship, together with the ADMM updating formula for the vector of Lagrange multipliers in (4.10), yields:
It then follows easily from (7.44) that
Computation of a lower bound for the right-hand side of (7.45).
Since \((u^*,t^*;\rho ^*)\) is a saddle-point of the augmented Lagrangian functional in (4.6), it satisfies the following optimality conditions [see (7.36)–(7.37) in the proof of Theorem 5.7]:
Similarly, by the construction of \(\big (u^{(k)},t^{(k)}\big )\) in Algorithm 1, we have:
Taking \(u = u^{(k)}\) in (7.46), \(u = u^*\) in (7.48) and recalling that \(\langle D^T w , z \rangle = \langle w , D z \rangle \) , by addition we obtain:
Similarly, taking \(t = t^{(k)}\) in (7.47) and \(t = t^*\) in (7.49), after addition we have:
where, we recall, the parameters \(\beta _1\) and \(\beta _2\) in (7.50)–(7.51) satisfy the constraints in (7.35).
By summing up (7.50) and (7.51), we obtain:
that is
where we introduced the positive coefficient \(\beta _3 > 0\) (the reason will become clear later on). We want the last term in (7.52) to take the form \(-\,\big \Vert \, c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2^2 \) with \(c_1,c_2 > 0\). Hence, first we impose that the coefficients of \(\big \Vert \bar{t}^{(k)} \big \Vert _2^2\) and \(\big \Vert D \bar{u}^{(k)} \big \Vert _2^2\) in (7.52) are strictly positive, which yields:
Combining (7.53) with conditions (7.35), we obtain:
From the condition on \(\beta _3\) in (7.54), the following constraint on \(\beta \) is derived:
We notice that condition (7.55) can be more stringent than (5.17), depending on \(\tau _c\), hence it has been taken as a hypothesis of this theorem and will be considered, together with (5.17), in the rest of the proof. From the condition on \(\beta _3\) in (7.54) it also follows that the coefficient \(\beta - \beta _3\) of the scalar product in (7.52) is positive.
Then, we have to impose that the coefficient of the term \(-\big \langle \, \bar{t}^{(k)} , D \bar{u}^{(k)} \, \big \rangle \) in (7.52) is twice the product of the square roots of the (positive) coefficients of \(\big \Vert \bar{t}^{(k)} \big \Vert _2^2\) and \(\big \Vert D \bar{u}^{(k)} \big \Vert _2^2\), that is:
By imposing the condition on \(\beta _3\) in (7.54), namely \(\beta -\beta _3 > 2a\), it is easy to verify that (7.56) admits acceptable solutions only if \(\beta _1 > \beta _2\). By setting in (7.56) \(\beta _1 = \tau _c \frac{9}{8} \, a\) and \(\beta _2 = a\), which are acceptable values according to this last result (since \(\tau _c > 1\), clearly \(\beta _1 > \beta _2\)) and also to conditions (7.54), we obtain:
We now check whether there exist acceptable values for the two remaining free parameters, namely \(\beta \) and \(\beta _3\), such that (7.57) holds. We impose that \(\beta \) in (7.57) satisfies its constraint in (5.17), which guarantees convexity of the augmented Lagrangian functional, and the derived condition in (7.55):
Since \(\tau _c > 1\) (and \(a > 0\)), both conditions in (7.58) are satisfied for any \(\beta _3 > 0\). Hence, for \(\beta _1 = \tau _c \frac{9}{8} \, a\), \(\,\beta _2 = a\) and any \(0< \beta _3 < \beta - 2a\), with \(\beta > 2a\), the last term in (7.52) can be written in the form
where \(c_1,c_2 > 0\), \(c_1 \ne c_2\). Replacing the expression in (7.59) for the last term in (7.52), we have:
where in (7.60) we multiplied both sides by the positive coefficient \(2\beta \). We notice that the left-hand side of (7.60) coincides with the right-hand side of (7.45), hence it follows that:
Computation of a lower bound for the term \(\varvec{T}\) in (7.61).
We can write:
First, we notice that:
Then, from the construction of \(t^{(k-1)}\) (from \(u^{(k-1)}\)), we have:
Taking \(t = t^{(k-1)}\) in (7.49) and \(t = t^{(k)}\) in (7.64), we obtain:
By addition of (7.65) and (7.66), we have that
Recalling that
substituting (7.68) into (7.67) and then dividing by \(\beta \), we obtain:
From (7.62), (7.63) and (7.69), we have:
Convergence results for sequences \(\varvec{t^{(k)},Du^{(k)},\rho ^{(k)}}\).
From (7.61) and (7.70), we obtain:
that is:
where we have introduced the scalar sequence \(\{s^{(k)}\}\), which is clearly bounded from below by zero. We notice that the coefficient \(\beta -2\beta _2\) in (7.72) is positive due to the constraint \(\beta > 2a\). Since the right-hand side of the first inequality in (7.72) is nonnegative, \(\{s^{(k)}\}\) is monotonically non-increasing, hence convergent. This implies that the right-hand side of (7.72) tends to zero as \(k \rightarrow \infty \). From these considerations and (7.72) it follows that:
Since the two coefficients \(c_1\), \(c_2\) in (7.76) satisfy \(c_1, c_2 \ne 0\), \(c_1 \ne c_2\), it follows from (7.75)–(7.76) that both the sequences \(\{\bar{t}^{(k)}\}\) and \(\{D \bar{u}^{(k)}\}\) tend to zero as \(k \rightarrow \infty \). The results in (7.73)–(7.76) can thus be rewritten in the following more concise and informative form:
where the last equality in (7.78) comes from the saddle-point properties stated in Theorem 5.7. Since it will be useful later on, we note that it follows from (7.78) that
Convergence results for sequence \(\varvec{u^{(k)}}\).
We now prove that \(\lim _{{k \rightarrow \infty } } u^{(k)} =u^*\). Since \((u^*,t^*;\rho ^*)\) is a saddle point of the augmented Lagrangian functional \(\mathcal {L}(u,t;\rho )\), we have
By taking \(u = u^{(k)}\), \(t = t^{(k)}\) in (7.81) and recalling the definition of \(\mathcal {L}(u,t;\rho )\) in (5.2), we have:
Taking \(u = u^*\) in (7.48) and \(t = t^*\) in (7.49), we obtain:
By summing up (7.83) and (7.84), we have:
Taking \(\lim \inf \) of (7.82) and \(\lim \sup \) of (7.85), and using the results in (7.77)–(7.80), we have
It follows from (7.86) that
We now manipulate \(F(u^{(k)})\) as follows:
On the other hand, we have that
where in (7.89) we have used the (optimality) condition (7.30). From (7.88) and (7.89) it follows that
that is
Taking the limit for \(k \rightarrow \infty \) of both sides of (7.91) and recalling (7.79) and (7.87), we obtain:
thus completing the proof. \(\square \)
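To conclude, the sketch below illustrates the generic structure of the ADMM iteration whose convergence is analysed above. It is a schematic stand-in under explicit assumptions, not the paper's Algorithm 1: the augmented Lagrangian is taken in the common form \(\frac{\lambda }{2}\Vert u-b\Vert _2^2 + R(t) - \langle \rho , t - Du\rangle + \frac{\beta }{2}\Vert t - Du\Vert _2^2\), the penalty inside R is the same hypothetical profile used in the earlier sketch (not the \(\phi \) of (2.1)), D is a simple 1-D forward-difference matrix, and the t-update uses a per-entry grid search in place of the closed-form shrinkage of Proposition 4.1.

```python
import numpy as np

def phi(s, T=1.0):
    # Hypothetical concave-type penalty (NOT the phi of (2.1))
    return T * (1.0 - np.exp(-s / T))

def admm_sketch(b, lam=10.0, beta=5.0, iters=200):
    n = b.size
    D = np.eye(n, k=1) - np.eye(n)            # 1-D forward differences
    D[-1, :] = 0.0                            # crude boundary handling
    u, t, rho = b.copy(), D @ b, np.zeros(n)
    A = lam * np.eye(n) + beta * D.T @ D      # u-update system matrix
    xi = np.linspace(0.0, 1.0, 201)           # grid for the 1-D reduction
    for _ in range(iters):
        # u-update: minimize the quadratic terms of the Lagrangian in u
        u = np.linalg.solve(A, lam * b + D.T @ (beta * t - rho))
        # t-update: per-entry minimization of phi(|t_i|) + beta/2 (t_i - r_i)^2
        r = D @ u + rho / beta
        for i in range(n):
            f = 0.5 * beta * (xi - 1.0)**2 * r[i]**2 + phi(xi * abs(r[i]))
            t[i] = xi[np.argmin(f)] * r[i]
        # multiplier update
        rho = rho + beta * (D @ u - t)
    return u, np.linalg.norm(D @ u - t)

b = np.concatenate([np.zeros(30), np.ones(30)]) + 0.05 * np.random.randn(60)
u, res = admm_sketch(b)
print("constraint residual ||Du - t||_2 =", res)
```

In agreement with the convergence results above, the constraint residual \(\Vert Du^{(k)} - t^{(k)}\Vert _2\) is driven towards zero along the iterations, while \(u^{(k)}\) approaches a minimizer of the (here hypothetical) objective.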