
Convex non-convex image segmentation


Abstract

A convex non-convex variational model is proposed for multiphase image segmentation. We consider a specially designed non-convex regularization term which adapts spatially to the image structures for better control of the segmentation boundary and easy handling of intensity inhomogeneities. The nonlinear optimization problem is efficiently solved by an alternating direction method of multipliers (ADMM) procedure. We provide a convergence analysis and perform numerical experiments on several images, showing the effectiveness of this procedure.


Notes

  1. A convex function is proper if it nowhere takes the value \(-\infty \) and is not identically equal to \(+\infty \).

References

  1. Bioucas-Dias, J., Figueiredo, M.: Fast image recovery using variable splitting and constrained optimization. IEEE Trans. Image Process. 19(9), 2345–2356 (2010)

  2. Blake, A., Zisserman, A.: Visual Reconstruction. MIT Press, Cambridge (1987)

  3. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  4. Brown, E.S., Chan, T.F., Bresson, X.: Completely convex formulation of Chan–Vese image segmentation model. Int. J. Comput. Vis. 98(1), 103–121 (2012)

  5. Cai, X.H., Chan, R.H., Zeng, T.Y.: A two-stage image segmentation method using a convex variant of the Mumford–Shah model and thresholding. SIAM J. Imaging Sci. 6(1), 368–390 (2013)

  6. Chan, T., Esedoglu, S., Nikolova, M.: Algorithms for finding global minimizers of image segmentation and denoising models. SIAM J. Appl. Math. 66(5), 1632–1648 (2006)

  7. Chan, T., Vese, L.A.: Active contours without edges. IEEE Trans. Image Process. 10, 266–277 (2001)

  8. Chan, T., Vese, L.A.: Active contours without edges for vector-valued images. J. Vis. Commun. Image Represent. 11, 130–141 (2000)

  9. Chen, P.Y., Selesnick, I.W.: Group-sparse signal denoising: non-convex regularization, convex optimization. IEEE Trans. Signal Process. 62, 3464–3478 (2014)

  10. Christiansen, M., Hanke, M.: Deblurring methods using antireflective boundary conditions. SIAM J. Sci. Comput. 30, 855–872 (2008)

  11. Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)

  12. Donatelli, M., Reichel, L.: Square smoothing regularization matrices with accurate boundary conditions. J. Comput. Appl. Math. 272, 334–349 (2014)

  13. Dong, B., Chien, A., Shen, Z.: Frame based segmentation for medical images. Commun. Math. Sci. 32, 1724–1739 (2010)

  14. Ekeland, I., Temam, R.: Convex Analysis and Variational Problems. Classics in Applied Mathematics. SIAM, Philadelphia (1999)

  15. Esedoglu, S., Tsai, Y.: Threshold dynamics for the piecewise constant Mumford–Shah functional. J. Comput. Phys. 211, 367–384 (2006)

  16. Huang, G., Lanza, A., Morigi, S., Reichel, L., Sgallari, F.: Majorization–minimization generalized Krylov subspace methods for \(\ell _p - \ell _q\) optimization applied to image restoration. BIT Numer. Math. 57(2), 351–378 (2017). doi:10.1007/s10543-016-0643-8

  17. Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization. In: Aujol, J.F., Nikolova, M., Papadakis, N. (eds.) Scale Space and Variational Methods in Computer Vision. SSVM 2015. Lecture Notes in Computer Science, vol. 9087, pp. 666–677. Springer, Cham (2015)

  18. Lanza, A., Morigi, S., Reichel, L., Sgallari, F.: A generalized Krylov subspace method for \(\ell _p\)–\(\ell _q\) minimization. SIAM J. Sci. Comput. 37(5), S30–S50 (2015)

  19. Lanza, A., Morigi, S., Sgallari, F.: Constrained TVp-\(\ell _2\) model for image restoration. J. Sci. Comput. 68(1), 64–91 (2016)

  20. Lanza, A., Morigi, S., Sgallari, F.: Convex image denoising via non-convex regularization with parameter selection. J. Math. Imaging Vis. 56(2), 195–220 (2016)

  21. Lanza, A., Morigi, S., Selesnick, I., Sgallari, F.: Nonconvex nonsmooth optimization via convex–nonconvex majorization–minimization. Numer. Math. 136(2), 343–381 (2017)

  22. Li, F., Ng, M., Zeng, T.Y., Shen, C.: A multiphase image segmentation method based on fuzzy region competition. SIAM J. Imaging Sci. 3, 277–299 (2010)

  23. Li, F., Shen, C., Li, C.: Multiphase soft segmentation with total variation and H1 regularization. J. Math. Imaging Vis. 37, 98–111 (2010)

  24. Lie, J., Lysaker, M., Tai, X.: A binary level set model and some applications to Mumford–Shah image segmentation. IEEE Trans. Image Process. 15, 1171–1181 (2006)

  25. Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42(5), 577–685 (1989)

  26. Ng, M.K., Chan, R.H., Tang, W.C.: A fast algorithm for deblurring models with Neumann boundary conditions. SIAM J. Sci. Comput. 21, 851–866 (1999)

  27. Nikolova, M.: Estimation of binary images by minimizing convex criteria. Proc. IEEE Int. Conf. Image Process. 2, 108–112 (1998)

  28. Parekh, A., Selesnick, I.W.: Convex denoising using non-convex tight frame regularization. arXiv:1504.00976 (2015)

  29. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  30. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis. Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)

  31. Sandberg, B., Kang, S., Chan, T.: Unsupervised multiphase segmentation: a phase balancing model. IEEE Trans. Image Process. 19, 119–130 (2010)

  32. Selesnick, I.W., Bayram, I.: Sparse signal estimation by maximally sparse convex optimization. IEEE Trans. Signal Process. 62(5), 1078–1092 (2014)

  33. Selesnick, I.W., Parekh, A., Bayram, I.: Convex 1-D total variation denoising with non-convex regularization. IEEE Signal Process. Lett. 22(2), 141–144 (2015)

  34. Strong, D.M., Chan, T.F.: Edge-preserving and scale-dependent properties of total variation regularization. Inverse Probl. 19(6), 165–187 (2003)

  35. Wu, C., Tai, X.C.: Augmented Lagrangian method, dual methods, and split Bregman iteration for ROF, vectorial TV, and high order models. SIAM J. Imaging Sci. 3(3), 300–339 (2010)

  36. Wu, C., Zhang, J., Tai, X.C.: Augmented Lagrangian method for total variation restoration with non-quadratic fidelity. Inverse Probl. Imaging 5(1), 237–261 (2011)

  37. Yuan, J., Bae, E., Tai, X., Boykov, Y.: A study on continuous max-flow and min-cut approaches. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2217–2224 (2010)

  38. Yuan, J., Bae, E., Tai, X., Boykov, Y.: A continuous max-flow approach to Potts model. In: ECCV 2010: Proceedings of the 11th European Conference on Computer Vision, pp. 332–345. Springer, Berlin (2010)

  39. Varga, R.S.: Matrix Iterative Analysis. Springer Series in Computational Mathematics. Springer, Berlin, Heidelberg (2000). doi:10.1007/978-3-642-05156-2


Acknowledgements

We would like to thank the referees for comments that led to improvements of the presentation. This work is partially supported by HKRGC GRF Grant No. CUHK300614, CUHK14306316, CRF Grant No. CUHK2/CRF/11G, AoE Grant AoE/M-05/12, CUHK DAG No. 4053007, and FIS Grant No. 1907303. Research by SM, AL and FS was supported by the “National Group for Scientific Computation (GNCS-INDAM)” and by the ex60% project of the University of Bologna “Funds for selected research topics”.

Author information


Correspondence to Serena Morigi.

Appendix

Proof of Lemma 3.2

Let \(x :{=} (x_1,x_2,x_3)^T \in {\mathbb R}^3\). Then, the function \(f(\,\cdot \,;\lambda ,T,a)\) in (3.3) can be rewritten in a more compact form as follows:

$$\begin{aligned} f(x;\lambda ,T,a) = \frac{\lambda }{6} \, x^T x \;\,{+}\;\; \phi \,\left( \sqrt{ x^T Q \, x } \,;\, T,a \right) \,, \end{aligned}$$
(7.1)

with the matrix \(Q \in {\mathbb R}^{3 \times 3}\) defined as

$$\begin{aligned} Q \;=\, \begin{bmatrix} 2&\quad -1&\quad -1 \\ -1&\quad 1&\quad 0 \\ -1&\quad 0&\quad 1 \end{bmatrix} . \end{aligned}$$
(7.2)

We introduce the eigenvalue decomposition of the matrix Q in (7.2):

$$\begin{aligned} Q = V \Lambda \, V^T, \quad \Lambda = \mathrm{diag}(3,1,0), \quad V V^T = V^T V = I_3 \,, \end{aligned}$$
(7.3)

where orthogonality of the modal matrix V in (7.3) follows from symmetry of matrix Q. Then, we decompose the diagonal eigenvalues matrix \(\Lambda \) in (7.3) as follows:

$$\begin{aligned} \Lambda = Z \widetilde{\Lambda } Z, \quad \; Z = \mathrm{diag}(\sqrt{3},1,1), \quad \widetilde{\Lambda } = \mathrm{diag}(1,1,0). \end{aligned}$$
(7.4)

Substituting (7.4) into (7.3), then (7.3) into (7.1), we obtain the following equivalent expression for the function f:

$$\begin{aligned} f(x;\lambda ,T,a) = \frac{\lambda }{6} \, x^T x \;\,{+}\;\; \phi \,\left( \sqrt{ x^T V Z \widetilde{\Lambda } Z \, V^T \, x } \,;\, T,a \right) . \end{aligned}$$
(7.5)

Recalling that convexity of a function is invariant under non-singular linear transformations of its domain, we introduce the following transformation of the domain \({\mathbb R}^3\) of the function f above:

$$\begin{aligned} x = T y, \quad T :{=} V Z^{-1} \;{\in }\; {\mathbb R}^{3 \times 3}\,, \end{aligned}$$
(7.6)

which is non-singular since V and Z are non-singular matrices. Defining \(f_T :{=} f \circ T\) as the function f in the transformed domain, we have:

$$\begin{aligned} f_T(y;\lambda ,T,a) = \frac{\lambda }{6} \, y^T Z^{-2} y \;\,{+}\;\; \phi \,\left( \sqrt{ y^T \widetilde{\Lambda } \, y } \,;\, T,a \right) . \end{aligned}$$
(7.7)

Recalling the definitions of Z and \(\widetilde{\Lambda }\) in (7.4), we can write (7.7) in the explicit form:

$$\begin{aligned} f_T(y;\lambda ,T,a)= & {} \frac{\lambda }{6} \, \left( \frac{y_1^2}{3} + y_2^2 + y_3^2 \right) \;\,{+}\;\; \phi \,\left( \sqrt{ y_1^2 + y_2^2 }\,;\, T,a \right) \nonumber \\= & {} \frac{\lambda }{6} \left( \frac{2}{3} y_2^2 + y_3^2\right) + \frac{\lambda }{18} \left( y_1^2 + y_2^2 \right) \,{+}\; \phi \,\left( \sqrt{ y_1^2 + y_2^2 }\,;\, T,a \right) \nonumber \\= & {} \frac{\lambda }{6} \left( \frac{2}{3} y_2^2 + y_3^2\right) + \, g(y_1,y_2;\lambda ,T,a)\,, \end{aligned}$$
(7.8)

where the function g in (7.8) is defined in (3.6). Since the first term in (7.8) is (quadratic) convex, a sufficient condition for the function \(f_T\) in (7.8) to be strictly convex is that the function g in (3.6) is strictly convex. This concludes the proof after recalling that the function f is strictly convex if and only if the function \(f_T\) is strictly convex. \(\square \)
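Although not needed for the argument, the decompositions (7.2)–(7.4) used above are easy to check numerically. Below is a minimal sketch (assuming NumPy is available; the explicit modal matrix V is one admissible choice, since the paper does not display it):

```python
import numpy as np

# The matrix Q from (7.2) and its eigen-decomposition (7.3)-(7.4).
Q = np.array([[ 2., -1., -1.],
              [-1.,  1.,  0.],
              [-1.,  0.,  1.]])

# One admissible orthonormal eigenvector basis (columns ordered as Lambda = diag(3,1,0)).
V = np.column_stack([np.array([2., -1., -1.]) / np.sqrt(6.),
                     np.array([0.,  1., -1.]) / np.sqrt(2.),
                     np.array([1.,  1.,  1.]) / np.sqrt(3.)])
Lam  = np.diag([3., 1., 0.])
Z    = np.diag([np.sqrt(3.), 1., 1.])
LamT = np.diag([1., 1., 0.])

assert np.allclose(V @ Lam @ V.T, Q)      # Q = V Lambda V^T, cf. (7.3)
assert np.allclose(V.T @ V, np.eye(3))    # V is orthogonal
assert np.allclose(Z @ LamT @ Z, Lam)     # Lambda = Z tilde(Lambda) Z, cf. (7.4)
print(np.linalg.eigvalsh(Q))              # approximately [0, 1, 3]
```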

Proof of Lemma 3.3

It follows immediately from the definition of strict convexity that a function from \({\mathbb R}^2\) into \({\mathbb R}\) is strictly convex if and only if its restriction to every straight line of \({\mathbb R}^2\) is strictly convex. Due to the radial symmetry of the function \(\psi \) in (3.7), the restriction of \(\psi \) to a generic straight line l is identical to its restriction to any other straight line obtained by rotating l around the origin. Hence, \(\psi \) is strictly convex if and only if all its restrictions to horizontal straight lines (any other direction, e.g. vertical, could be chosen as well) with non-negative intercept are strictly convex.

We denote by \(h_0\) and \(h_k\) the functions from \({\mathbb R}\) into \({\mathbb R}\) corresponding to the restriction of \(\psi \) to the horizontal straight line with null intercept, namely the horizontal coordinate axis, and to any horizontal straight line with positive intercept \(k > 0\), respectively. From the definition of the function \(\psi \) in (3.7), we have:

$$\begin{aligned} h_0(t) = \psi \left( t,0\right)= & {} z\left( |t|\right) , \qquad \qquad \;\; t \in {\mathbb R}, \end{aligned}$$
(7.9)
$$\begin{aligned} h_k(t) = \psi \left( t,k\right)= & {} z\left( \sqrt{t^2+k^2}\right) , \quad t \in {\mathbb R}, \;\, k > 0. \end{aligned}$$
(7.10)

Since the function \(\psi \) in (3.7) is strictly convex if and only if both \(h_0\) in (7.9) and \(h_k\) in (7.10) are strictly convex, it is clear that a necessary condition for \(\psi \) to be strictly convex is that \(h_0\) in (7.9) is strictly convex. It thus remains to demonstrate that \(h_0\) being strictly convex is also a sufficient condition for \(\psi \) to be strictly convex or, equivalently, that strict convexity of \(h_0\) in (7.9) implies strict convexity of \(h_k\) in (7.10) for any positive k.

The functions \(h_0\) and \(h_k\) in (7.9)–(7.10) are clearly even and, since we are assuming \(z \in \mathcal {C}^1({\mathbb R}_+)\), we have that \(h_k \in \mathcal {C}^1({\mathbb R})\) and \(h_0 \in \mathcal {C}^0({\mathbb R}) \cap \mathcal {C}^1({\mathbb R}\setminus \{0\})\). In particular, the first-order derivatives of \(h_0\) and \(h_k\) are as follows:

$$\begin{aligned} h_0'(t)= & {} z'\left( |t|\right) \mathrm {sign}(t), \qquad \qquad \qquad t \in {\mathbb R}\setminus \{0\}, \end{aligned}$$
(7.11)
$$\begin{aligned} h_k'(t)= & {} z'\left( \sqrt{t^2+k^2}\right) \frac{t}{\sqrt{t^2+k^2}}, \quad \, t \in {\mathbb R}. \end{aligned}$$
(7.12)

We note that \(h_0\) is continuously differentiable also at the point \(t=0\) if and only if the right-sided derivative of the function z at 0 is equal to 0.

We now assume that the function \(h_0\) in (7.9) is strictly convex. This implies that the first-order derivative function \(h_0'\) is monotonically increasing on its entire domain \({\mathbb R}\setminus \{0\}\). It thus follows from the definition of \(h_0'\) in (7.11) that the first-order derivative function \(z'\) is nonnegative and monotonically increasing on \({\mathbb R}_+\). We then notice that, for any given \(k > 0\), the first-order derivative function \(h_k'\) in (7.12) is continuous (since \(z'\) is continuous on \({\mathbb R}_+\) by assumption) and odd (hence \(h_k'(0) = 0\)). Finally, by recalling that the composition and the product of positive, monotonically increasing functions are monotonically increasing, it follows that \(h_k'\) in (7.12) is monotonically increasing on the entire real line, hence \(h_k\) in (7.10) is strictly convex. This completes the proof. \(\square \)
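A quick numerical illustration of the lemma (a sketch assuming NumPy; the choice z(s) = s + s² is a hypothetical example of a strictly convex z ∈ C¹(ℝ₊) with z' non-negative and increasing, not the penalty used in the paper):

```python
import numpy as np

# Hypothetical strictly convex z on R_+ with z' >= 0 and increasing: z(s) = s + s^2.
z = lambda s: s + s**2

t = np.linspace(-5.0, 5.0, 2001)
for k in [0.1, 1.0, 3.0]:
    h_k = z(np.sqrt(t**2 + k**2))     # restriction (7.10) of psi to a horizontal line
    second_diff = np.diff(h_k, 2)     # discrete analogue of h_k''
    print(k, second_diff.min() > 0.0) # strict convexity of h_k, as the lemma predicts
```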

Proof of Proposition 3.7

The functional \(\mathcal {J}(\,\cdot \,;\lambda ,T,a)\) in (1.1) is clearly proper. Moreover, since the functions \(\phi (\,\cdot \,;T,a)\) and \(\Vert \cdot \Vert _2\) are both continuous and bounded from below by zero, \(\mathcal {J}\) is also continuous and bounded from below by zero. In particular, we notice that \(\mathcal {J}\) achieves the zero value only for \(u = b\) with b a constant image. The penalty function \(\phi (\,\cdot \,;T,a)\) is not coercive, hence the regularization term in \(\mathcal {J}\) is not coercive. However, since the fidelity term is quadratic and strictly convex, hence coercive, and the regularization term is bounded from below by zero, \(\mathcal {J}\) is coercive.

As far as strong convexity is concerned, it follows from Definition 3.6 that the functional \(\mathcal {J}(\,\cdot \,;\lambda ,T,a)\) in (1.1) is \(\mu \)-strongly convex if and only if the functional \(\widetilde{\mathcal {J}}(u;\lambda ,T,a,\mu )\) defined as

$$\begin{aligned} \widetilde{\mathcal {J}}(u;\lambda ,T,a,\mu ):= & {} \underbrace{ \frac{\lambda }{2} \, \Vert u - b \Vert _2^2 + \sum _{i = 1}^{n} \phi \left( \Vert (\nabla u)_i \Vert _2 ; T,a \right) }_{\mathcal {J}(u;\lambda ,T,a)} \;{-}\; \frac{\mu }{2} \, \Vert u \Vert _2^2 \nonumber \\= & {} \mathcal {A}(u) + \frac{\lambda -\mu }{2} \, \Vert u \Vert _2^2 + \sum _{i = 1}^{n} \phi \left( \Vert (\nabla u)_i \Vert _2 ; T,a \right) \end{aligned}$$
(7.13)

is convex, where \(\mathcal {A}(u)\) is an affine function of u. We notice that the functional \(\widetilde{\mathcal {J}}\) in (7.13) almost coincides with the original functional \(\mathcal {J}\) in (1.1), the only difference being that the coefficient is \(\lambda -\mu \) instead of \(\lambda \). Hence, we can apply the results in Theorem 3.5 and state that \(\widetilde{\mathcal {J}}\) in (7.13) is convex if condition (3.10) is satisfied with \(\lambda - \mu \) in place of \(\lambda \). By substituting \(\lambda - \mu \) for \(\lambda \) in condition (3.10), deriving the solution interval for \(\mu \) and then taking the maximum, one obtains equality (3.22). \(\square \)

Proof of Proposition 4.1

The demonstration of condition (4.17) for strict convexity of the function \(\theta \) in (4.16) is straightforward. In fact, the function \(\theta \) can be equivalently rewritten as

$$\begin{aligned} \theta (x) = \underbrace{ \phi \left( \Vert x \Vert _2;T,a \right) +\, \frac{\beta }{2} \, \Vert x \Vert _2^2 }_{\bar{\theta }(x)} \,+ \mathcal {A}(x),\quad x \in {\mathbb R}^2, \end{aligned}$$
(7.14)

with \(\mathcal {A}(x)\) an affine function, so that a necessary and sufficient condition for \(\theta \) to be strictly convex is that the function \(\bar{\theta }\) in (7.14) is strictly convex. We then notice that \(\bar{\theta }\) is almost identical to the function g in (3.6), the only difference being that the coefficient \(\beta /2\) takes the place of the coefficient \(\lambda /18\) of g. By setting \(\lambda /18 = \beta /2 \Longleftrightarrow \lambda = 9 \beta \), the two functions coincide. The condition for strict convexity of g in (3.10) reads \(\lambda > 9\,a\), hence by substituting \(\lambda = 9 \beta \) into it we obtain condition (4.17) for strict convexity of \(\theta \).

We remark that condition \(\beta > a\) reduces to \(\beta \ge a\) when only convexity is required.

For the proof of statement (4.19), according to which the unique solution \(x^*\) of the strictly convex problem (4.18) is obtained by a shrinkage of vector r, we refer the reader to [20, Proposition 4.5].

We now prove statement (4.20). First, we notice that if \(\Vert r\Vert _2 = 0\), i.e. r is the null vector, the minimization problem in (4.18) with the objective function \(\theta (x)\) defined in (4.16) reduces to

$$\begin{aligned} \arg \min _{x \in {\mathbb R}^2} \, \left\{ \, \phi \left( \Vert x \Vert _2;T,a \right) + \frac{\beta }{2} \, \Vert x \Vert _2^2 \, \right\} . \end{aligned}$$
(7.15)

Since the former and the latter terms of the cost function in (7.15) are, respectively, a monotonically non-decreasing and a monotonically increasing function of \(\Vert x\Vert _2\), the solution of (7.15) is clearly \(x^* = 0\). Hence, the case \(\Vert r\Vert _2 = 0\) can be easily dealt with by taking any value \(\xi ^*\) in formula (4.19). We included the case \(\Vert r\Vert _2 = 0\) in formula a) of (4.20). In the following, we consider the case \(\Vert r\Vert _2 > 0\).

Based on the previously demonstrated statement (4.19), by setting \(x = \xi \, r\), \(\xi \ge 0\), we turn the original unconstrained 2-dimensional problem in (4.18) into the following equivalent constrained 1-dimensional problem:

$$\begin{aligned}&\xi ^* {\leftarrow }\; \mathrm {arg} \min _{0 \le \xi \le 1} \left\{ \, \phi \left( \left\| \xi r \right\| _2;T,a \right) + \frac{\beta }{2} \left\| \xi r - r \right\| _2^2 \,\right\} \nonumber \\&\quad {\leftarrow }\; \mathrm {arg} \min _{0 \le \xi \le 1} \left\{ \, f(\xi ) :{=} \phi \left( \left\| r \right\| _2 \xi ;T,a \right) + \frac{\beta }{2} \left\| r \right\| _2^2 \left( \xi ^2 - 2\xi \right) \,\right\} , \end{aligned}$$
(7.16)

where in (7.16) we omitted the constants and introduced the cost function \(f: {\mathbb R}_+ \rightarrow {\mathbb R}\) for future reference. Since the function \(\phi \) in (7.16), which is defined in (2.1), is continuously differentiable on \({\mathbb R}_+\), the cost function f in (7.16) is also continuously differentiable on \({\mathbb R}_+\). Moreover, f is strictly convex since it represents the restriction of the strictly convex function \(\theta \) in (4.16) to the half-line \(\xi \, r, \, \xi \ge 0\). Hence, the first-order derivative \(f'(\xi )\) is a continuous, monotonically increasing function and a necessary and sufficient condition for an inner point \(0< \xi < 1\) to be the global minimizer of f is that \(f'(\xi ) = 0\). From the definition of f in (7.16) we have:

$$\begin{aligned} f'(\xi ) = \Vert r \Vert _2 \, \big [ \, \phi ' \left( \Vert r \Vert _2 \xi ;T,a \right) + \beta \Vert r \Vert _2 (\xi - 1) \, \big ], \end{aligned}$$
(7.17)

and, in particular:

$$\begin{aligned} f'(0^+) = - \beta \, \Vert r \Vert _2^2 < 0 , \qquad f'(1) = \Vert r \Vert _2 \, \phi ' \left( \Vert r \Vert _2;T,a \right) \ge 0. \end{aligned}$$
(7.18)

It follows from (7.18) that the solution of (7.16) cannot be \(\xi ^* = 0\), hence it is either \(\xi ^* = 1\) or an inner stationary point.

Recalling the definition of \(\phi (\,\cdot \,;T,a)\) in (2.1), after some simple manipulations the function \(f'(\xi )\) in (7.17) can be rewritten in the following explicit form:

$$\begin{aligned} f'(\xi )= & {} \left\{ \begin{array}{lclll} f_1'(\xi ) &{} {=} &{} \Vert r \Vert _2 \, \Big [ \, a \frac{T_2-T}{T} \Vert r \Vert _2 \xi + \beta \Vert r \Vert _2 (\xi - 1) \, \Big ], &{} \Vert r \Vert _2 \xi \in [0,T] \\ f_2'(\xi ) &{} {=} &{} \Vert r \Vert _2 \, \Big [ \, -a \Vert r \Vert _2 \xi + a T_2 + \beta \Vert r \Vert _2 (\xi - 1) \, \Big ], &{} \Vert r \Vert _2 \xi \in (T, T_2] \\ f_3'(\xi ) &{} {=} &{} \Vert r \Vert _2 \, \Big [ \, 0 + \beta \Vert r \Vert _2 (\xi - 1) \, \Big ], &{} \Vert r \Vert _2 \xi \in (T_2, +\infty ) \end{array} \right. \nonumber \\ \end{aligned}$$
(7.19)

that is:

$$\begin{aligned} f'(\xi ) = \left\{ \begin{array}{lclll} f_1'(\xi ) &{} {=} &{} \Vert r \Vert _2^2 \, \Big [ \, \big (\beta - a + a \frac{T_2}{T} \big ) \xi &{} - \beta \, \Big ], &{} \xi \in \mathcal {D}_1 :{=}\, \Big [\,0,\frac{T}{\Vert r \Vert _2}\Big ] \\ f_2'(\xi ) &{} {=} &{} \Vert r \Vert _2^2 \, \Big [ \, (\beta - a) \xi + a \frac{T_2}{\Vert r \Vert _2} &{} - \beta \, \Big ], &{} \xi \in \mathcal {D}_2 :{=}\, \Big (\frac{T}{\Vert r \Vert _2}, \frac{T_2}{\Vert r \Vert _2} \Big ) \\ f_3'(\xi ) &{} {=} &{} \Vert r \Vert _2^2 \, \Big [ \, \beta \xi &{} - \beta \, \Big ], &{} \xi \in \mathcal {D}_3 :{=}\, \Big [\frac{T_2}{\Vert r \Vert _2}, +\infty \Big ) \end{array} \right. \end{aligned}$$
(7.20)

Denoting by \(\xi _1^*\), \(\xi _2^*\), \(\xi _3^*\) the points where \(f_1'\), \(f_2'\), \(f_3'\) in (7.20) equal zero, respectively, we have:

$$\begin{aligned} \xi _1^* = \frac{T}{T + \left( T_2 - T \right) \frac{a}{\beta }}, \qquad \xi _2^* = \frac{\beta }{\beta -a} - \frac{a T_2}{\beta - a} \, \frac{1}{\Vert r\Vert _2}, \qquad \xi _3^* = 1. \end{aligned}$$
(7.21)

However, for \(\xi _1^*\), \(\xi _2^*\) and \(\xi _3^*\) in (7.21) to be acceptable candidate solutions of problem (7.16), they must belong to the domains \(\mathcal {D}_1\), \(\mathcal {D}_2\), \(\mathcal {D}_3\) of \(f_1'\), \(f_2'\), \(f_3'\), respectively, and obviously also to the optimization domain \(\mathcal {O} :{=} [0,1]\) of problem (7.16). We have:

$$\begin{aligned} \left\{ \begin{array}{ll} \xi _1^* \,{\in }\; \mathcal {D}_1 &{}\quad \mathrm {if}\;\, \Vert r\Vert _2 \,{\in }\, \left( 0,T + (T_2 - T) \frac{a}{\beta }\right] \\ \xi _1^* \,{\in }\; \mathcal {O} &{}\quad \,\forall \,\, \Vert r\Vert _2 \end{array} \right.&\Longrightarrow&\begin{array}{l} \xi _1^* \;{\in }\; \mathcal {D}_1 \cap \mathcal {O} \;\mathrm {if}\\ \Vert r\Vert _2 \,{\in }\, \left( 0,T + (T_2 - T) \frac{a}{\beta }\right] \end{array} \nonumber \\ \end{aligned}$$
(7.22)
$$\begin{aligned} \left\{ \begin{array}{ll} \xi _2^* \,{\in }\; \mathcal {D}_2 &{}\quad \mathrm {if}\;\, \Vert r\Vert _2 \,{\in }\, \left( T + (T_2 - T) \frac{a}{\beta },T_2\right) \\ \xi _2^* \,{\in }\; \mathcal {O} &{}\quad \mathrm {if}\;\, \Vert r\Vert _2 \,{\in }\, \left[ \frac{a}{\beta }T_2,T_2\right] \end{array} \right.&\Longrightarrow&\begin{array}{l} \xi _2^* \;{\in }\; \mathcal {D}_2 \cap \mathcal {O} \;\mathrm {if} \\ \Vert r\Vert _2 \,{\in }\, \left( T + (T_2 - T) \frac{a}{\beta },T_2\right) \end{array} \nonumber \\ \end{aligned}$$
(7.23)
$$\begin{aligned} \left\{ \begin{array}{ll} \xi _3^* \,{\in }\; \mathcal {D}_3 &{}\quad \mathrm {if}\;\, \Vert r\Vert _2 \,{\in }\, \big [\, T_2, +\infty \big ) \\ \xi _3^* \,{\in }\; \mathcal {O} &{}\quad \,\forall \,\, \Vert r\Vert _2 \end{array} \right.&\Longrightarrow&\begin{array}{l} \xi _3^* \;{\in }\; \mathcal {D}_3 \cap \mathcal {O} \;\mathrm {if} \\ \Vert r\Vert _2 \,{\in }\, \big [\, T_2, +\infty \big ) \end{array} \end{aligned}$$
(7.24)

The proof of statement (4.20) is thus completed. \(\square \)
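The case analysis (7.20)–(7.24) can also be checked numerically: for hypothetical parameter values of a, β, T, T₂ (with β > a and T < T₂), the closed-form candidate selected according to the three regimes of \(\Vert r\Vert _2\) coincides, up to the tolerance, with the root of the piecewise derivative (7.20) located by bisection on [0, 1]. A minimal sketch:

```python
# Hypothetical parameters with beta > a > 0 and 0 < T < T2 (cf. (4.17)).
a, beta, T, T2 = 1.0, 3.0, 0.5, 2.0

def f_prime(xi, r):
    """Piecewise derivative of the 1-D cost f, following (7.20), for r = ||r||_2 > 0."""
    s = r * xi
    if s <= T:
        return r**2 * ((beta - a + a * T2 / T) * xi - beta)
    elif s < T2:
        return r**2 * ((beta - a) * xi + a * T2 / r - beta)
    return r**2 * (beta * xi - beta)

def xi_closed_form(r):
    """Candidate minimizer selected according to (7.21)-(7.24)."""
    if r <= T + (T2 - T) * a / beta:
        return T / (T + (T2 - T) * a / beta)                  # xi_1^*
    if r < T2:
        return beta / (beta - a) - a * T2 / ((beta - a) * r)  # xi_2^*
    return 1.0                                                # xi_3^*

def xi_bisection(r, tol=1e-12):
    """f is strictly convex with f'(0+) < 0 and f'(1) >= 0, see (7.18),
    so its minimizer on [0,1] is the unique point where f' changes sign."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f_prime(mid, r) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

for r in [0.3, 1.2, 1.7, 2.5]:                    # one value of ||r||_2 per regime
    print(r, xi_closed_form(r), xi_bisection(r))  # the two columns agree
```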

Proof of Theorem 5.7

Based on the definition of the augmented Lagrangian functional in (5.2), we rewrite in explicit form the first inequality of the saddle-point condition in (4.7):

$$\begin{aligned}&\displaystyle { \mathcal {L}\,(u^*,t^*;\rho ) = F(u^*) + R(t^*) + \frac{\beta }{2} \, \Vert t^* - D u^* \Vert _2^2 \;{-}\; \langle \, \rho , t^* - D u^* \, \rangle } \nonumber \\&\quad \le \displaystyle { \mathcal {L}\,(u^*,t^*;\rho ^*) = F(u^*) + R(t^*) + \frac{\beta }{2} \, \Vert t^* - D u^* \Vert _2^2 \;{-}\; \langle \, \rho ^* , t^* - D u^* \, \rangle } \nonumber \\&\quad \displaystyle { \forall \, \rho \;{\in }\; {\mathbb R}^{2n} \; , } \end{aligned}$$
(7.25)

and, similarly, the second inequality:

$$\begin{aligned}&\displaystyle { \mathcal {L}\,(u^*,t^*;\rho ^*) = F(u^*) + R(t^*) + \frac{\beta }{2} \, \Vert t^* - D u^* \Vert _2^2 \;{-}\; \langle \, \rho ^* , t^* - D u^* \, \rangle } \nonumber \\&\quad \le \displaystyle { \mathcal {L}\,(u,t;\rho ^*) = F(u) \,\;+ R(t) \,+ \frac{\beta }{2} \, \Vert t - D u \Vert _2^2 \;\;\;\;{-}\; \langle \, \rho ^* , t - D u \, \rangle } \nonumber \\&\quad \displaystyle { \forall \, (u,t) \;{\in }\; {\mathbb R}^n {\times }\, {\mathbb R}^{2n} \; . } \end{aligned}$$
(7.26)

In the first part of the proof, we prove that if \(\,(u^*,t^*;\rho ^*)\,\) is a solution of the saddle-point problem (4.6)–(4.7), that is, it satisfies the two inequalities (7.25) and (7.26), then \(u^*\) is a global minimizer of the functional \(\mathcal {J}\) in (1.1).

Since (7.25) must be satisfied for any \(\rho \;{\in }\; {\mathbb R}^{2n}\), we have:

$$\begin{aligned} t^* = D u^* \; . \end{aligned}$$
(7.27)

The second inequality (7.26) must be satisfied for any \((u,t) \;{\in }\; {\mathbb R}^n {\times }\, {\mathbb R}^{2n}\). Hence, by taking \(t = Du\) in (7.26) and, at the same time, substituting in (7.26) the previously derived condition (7.27), we obtain:

$$\begin{aligned}&\displaystyle { \mathcal {J}(u^*;\lambda ,T,a) = F(u^*) + R(Du^*) } \nonumber \\&\quad \le \displaystyle { \mathcal {J}(u;\lambda ,T,a) \;\;=\,\, F(u) \,\;+ R(Du) \quad \, \forall \, u \;{\in }\; {\mathbb R}^n.} \end{aligned}$$
(7.28)

Inequality (7.28) indicates that \(u^*\) is a global minimizer of the functional \(\mathcal {J}\) in (1.1). Hence, we have proved that all the saddle-point solutions of problem (4.6)–(4.7), if any exist, are of the form \(\,(u^*,Du^*;\rho ^*)\,\), with \(u^*\) denoting a global minimizer of \(\mathcal {J}\).

In the second part of the proof, we prove that at least one solution of the saddle-point problem exists. In particular, we prove that if \(u^*\) is a global minimizer of \(\mathcal {J}\) in (1.1), then there exists at least one pair \(\,(t^*,\,\rho ^*) \,{\in }\; {\mathbb R}^{2n} {\times }\, {\mathbb R}^{2n}\) such that \((u^*,t^*;\rho ^*)\) is a solution of the saddle-point problem (4.6)–(4.7), that is, it satisfies the two inequalities (7.25) and (7.26). The proof relies on a suitable choice of the vectors \(t^*\) and \(\rho ^*\). We take:

$$\begin{aligned} t^*= & {} D u^*, \end{aligned}$$
(7.29)
$$\begin{aligned} \rho ^*&{\in } \bar{\partial }_{t} \left[ \, R \,\right] (Du^*) \;\;\;\, \mathrm {such}\;\mathrm {that} \quad D^T \rho ^* + \lambda \left( u^* - b\right) = 0, \end{aligned}$$
(7.30)

where the term \(\bar{\partial }_{t} \left[ \, R \,\right] (Du^*)\) indicates the Clarke generalized gradient (with respect to t, calculated at \(Du^*\)) of the nonconvex regularization function R defined in (5.1). We notice that a vector \(\rho ^*\) satisfying (7.30) is guaranteed to exist thanks to Proposition 5.2. In fact, since here we are assuming that \(u^*\) is a global minimizer of functional \(\mathcal {J}\), the first-order optimality condition in (5.5) holds true.

Due to (7.29), the first saddle-point condition in (7.25) is clearly satisfied. Proving the second condition (7.26) is less straightforward: we need to investigate the optimality conditions of the functional \(\mathcal {L}\,(u,t;\rho ^*)\) with respect to the pair of primal variables (u, t). We follow the same procedure used, e.g., in [35], which requires \(\mathcal {L}\,(u,t;\rho ^*)\) to be jointly convex in (u, t). According to Proposition 5.4, in our case this requirement is fulfilled if the penalty parameter \(\beta \) satisfies condition (5.17), which has thus been taken as a hypothesis of this theorem. Hence, we can apply Lemma 5.6 and state that (7.26) is satisfied if and only if both of the following optimality conditions are met:

$$\begin{aligned}&u^* \;{\in }\;\;\, \displaystyle {\arg \min _{u} \, \mathcal {L}\,(u,t^*;\rho ^*)} = \displaystyle {\arg \min _{u} \, \mathcal {L}^{(u)}\,(u)}, \end{aligned}$$
(7.31)
$$\begin{aligned}&t^* \;{\in }\;\;\, \displaystyle {\arg \min _{t} \, \mathcal {L}\,(u^*,t;\rho ^*)} = \displaystyle {\arg \min _{t} \, \mathcal {L}^{(t)}\,(t)}, \end{aligned}$$
(7.32)

where in (7.31)–(7.32) we introduced the two functions \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) representing the restrictions of functions \(\mathcal {L}\,(u,t^*;\rho ^*)\) and \(\mathcal {L}\,(u^*,t;\rho ^*)\) to only the terms depending on the optimization variables u and t, respectively. In particular, after recalling the definition of the augmented Lagrangian functional in (5.2), we have

$$\begin{aligned} \mathcal {L}^{(u)}(u)= & {} \displaystyle { \underbrace{ F(u) \;{-}\; \frac{\beta _1}{2} \, \Vert t - D u \Vert _2^2 }_{Q^{(u)}(u)} \,+\, \underbrace{ \frac{\beta +\beta _1}{2} \, \Vert t - D u \Vert _2^2 \,{+}\, \langle \, \rho ^* , D u \, \rangle }_{S^{(u)}(u)} }, \end{aligned}$$
(7.33)
$$\begin{aligned} \mathcal {L}^{(t)}(t)= & {} \displaystyle { \underbrace{ R(t) + \frac{\beta _2}{2} \, \Vert t - D u \Vert _2^2 }_{Q^{(t)}(t)} \,+\, \underbrace{ \frac{\beta -\beta _2}{2} \, \Vert t - D u \Vert _2^2 \,{-}\, \langle \, \rho ^* , t \, \rangle }_{S^{(t)}(t)} }, \end{aligned}$$
(7.34)

where, as in [35], \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) have been split into the sum of two functions with the aim of then deriving optimality conditions for \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\) by means of Lemma 5.5. Unlike in [35], the ADMM quadratic penalty term \(\frac{\beta }{2} \, \Vert t - D u \Vert _2^2\) has been split into two parts (differently in \(\mathcal {L}^{(u)}\) and \(\mathcal {L}^{(t)}\)) in order to deal with the nonconvex regularization term. In particular, the coefficients \(\beta _1\), \(\beta _2\) introduced in (7.33)–(7.34) satisfy

$$\begin{aligned} - \, \beta \;{<}\;\, \beta _1 \;{\le }\;\, \tau _c \, \frac{9}{8}a, \qquad \, a \,\;{\le }\;\, \beta _2 \;{<}\; \beta , \end{aligned}$$
(7.35)

so that the terms \(S^{(u)}\), \(S^{(t)}\) in (7.33)–(7.34) are clearly convex and the terms \(Q^{(u)}\), \(Q^{(t)}\) are convex due to the results in Lemma 5.3 and Proposition 4.1, respectively. We also notice that all the functions \(Q^{(u)}\), \(Q^{(t)}, S^{(u)}\), \(S^{(t)}\) are proper and continuous and that \(S^{(u)}\), \(S^{(t)}\) are Gâteaux-differentiable. Hence, we can apply Lemma 5.5 separately to (7.33) and (7.34), to check whether the pair \((u^*,t^*)\) satisfies the optimality conditions in (7.31) and (7.32), so that the second saddle-point condition (7.26) holds true. We obtain:

$$\begin{aligned}&F(u) \;{-}\; \frac{\beta _1}{2} \, \Vert t^* - D u \Vert _2^2 \;{-}\; F(u^*) + \frac{\beta _1}{2} \, \Vert t^* - D u^* \Vert _2^2 \nonumber \\&\quad {-}\, \big \langle \, (\beta +\beta _1) D^T ( \underbrace{t^* - D u^*}_{0} ) - D^T \rho ^* , u - u^* \big \rangle \,{\ge }\; 0 \quad \; \forall \, u \;{\in }\; {\mathbb R}^n, \end{aligned}$$
(7.36)
$$\begin{aligned}&R(t) + \frac{\beta _2}{2} \, \Vert t - D u^* \Vert _2^2 \;{-}\; R(t^*) \;{-}\; \frac{\beta _2}{2} \, \Vert t^* - D u^* \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (\underbrace{t^* - D u^*}_{0}) - \rho ^*, \, t - t^* \,\, \big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; {\mathbb R}^{2n}, \end{aligned}$$
(7.37)

where the term \(t^*-Du^*\) in (7.36)–(7.37) is zero due to the setting (7.29). We rewrite conditions (7.36)–(7.37) as follows:

$$\begin{aligned}&F(u) \;{-}\; \frac{\beta _1}{2} \, \Vert t^* - D u \Vert _2^2 \;{-}\; \left( F(u^*) \;{-}\; \frac{\beta _1}{2} \, \Vert t^* - D u^* \Vert _2^2 \right) \nonumber \\&\quad {-}\, \big \langle \, \lambda (u^* - b) \underbrace{-\, \lambda (u^* - b) - D^T \rho ^*}_{0} \,{+}\, \beta _1 D^T (t^* - D u^*) , \, u - u^* \big \rangle \,{\ge }\; 0 \;\; \forall \, u \;{\in }\; {\mathbb R}^n , \nonumber \\ \end{aligned}$$
(7.38)
$$\begin{aligned}&R(t) + \frac{\beta _2}{2} \, \Vert t - D u^* \Vert _2^2 \;{-}\; \left( R(t^*) + \frac{\beta _2}{2} \, \Vert t^* - D u^* \Vert _2^2 \right) \nonumber \\&\quad {-}\;\ \bigg \langle \, \rho ^* + \beta _2 \, (t^* - Du^*), \, t - t^* \, \bigg \rangle \,\;{\ge }\;\, 0 \quad \;\;\, \forall \, t \;{\in }\; {\mathbb R}^{2n}, \end{aligned}$$
(7.39)

where in (7.38) we added and subtracted the term \(\lambda \, (u^*-b)\) and added the null term \(\beta _1 D^T (t^* - D u^*)\), and in (7.39) we added the null term \(\beta _2 \, (t^* - Du^*)\). The term \(-\lambda \, (u^*-b) - D^T \rho ^*\) in (7.38) is null due to the setting (7.30). By introducing the two functions

$$\begin{aligned} U(u) \, :{=} \; F(u) \;{-}\;\, \frac{\beta _1}{2} \, \Vert t^* - D u \Vert _2^2, \qquad T(t) :{=} R(t) \;+ \frac{\beta _2}{2} \, \Vert t {-} D u^* \Vert _2^2, \end{aligned}$$
(7.40)

which are convex under conditions (7.35) for the same reason for which the functions \(Q^{(u)}\), \(Q^{(t)}\) in (7.33)–(7.34) are convex, conditions (7.38)–(7.39) can be rewritten as

$$\begin{aligned}&U(u) \,\,{-}\;\, U(u^*) \,{-}\; \left\langle \, \overbrace{\lambda \, (u^* - b) \,{+}\, \beta _1 D^T (t^* - D u^*)}^{\textstyle { \partial _u \big [ U \big ] \, (u^*) }} , \, u - u^* \right\rangle \,\,{\ge }\;\, 0 \;\;\, \forall \, u \;{\in }\; {\mathbb R}^n , \nonumber \\ \end{aligned}$$
(7.41)
$$\begin{aligned}&T(t) \,\,\;{-}\;\, T(t^*) \,\;{-}\;\, \left\langle \underbrace{\rho ^* + \beta _2 \, (t^* - Du^*)}_{\textstyle { {\in }\; \partial _t \big [ T \big ] \, (t^*) }} \quad , \,\, t \,- t^* \, \right\rangle \,\,{\ge }\;\, 0 \;\;\, \forall \, t \;{\in }\; {\mathbb R}^{2n} ,\qquad \qquad \end{aligned}$$
(7.42)

where we highlighted that the left side of the scalar product in (7.41) represents the subdifferential (actually, the standard gradient) of function U calculated at \(u^*\) and that the left side of the scalar product in (7.42) is a particular vector belonging to the subdifferential of function T calculated at \(t^*\). This second statement comes from the definition of function T in (7.40) and from settings (7.29)–(7.30).

Optimality conditions in (7.41)–(7.42) are easily proved by noticing that the left-hand sides of (7.41)–(7.42) represent the Bregman distances associated with functions U and T, respectively, which are known to be non-negative for convex functions. Hence, the second saddle-point condition in (7.26) is satisfied and, finally, the second and last part of the proof is completed. \(\square \)
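For orientation in the convergence analysis that follows, the sketch below spells out one plain ADMM iteration for the split problem \(\min _{u,t} F(u) + R(t)\) subject to \(t = Du\), with the multiplier update that appears in error form in (7.44). It is a hedged sketch only: the u-update solves the normal equations of its quadratic sub-problem, the t-update uses simple soft-thresholding as a stand-in for the CNC shrinkage operator of Proposition 4.1, and the 1-D forward-difference operator D, the update ordering and all parameter values are illustrative, not those of the paper's Algorithm 1.

```python
import numpy as np

# Schematic ADMM for  min_u F(u) + R(Du),  F(u) = lam/2 ||u - b||^2,
# on a 1-D signal with D the forward-difference operator.
np.random.seed(0)
n = 100
b = np.concatenate([np.zeros(n // 2), np.ones(n - n // 2)]) + 0.1 * np.random.randn(n)

lam, beta = 10.0, 2.0
D = np.diff(np.eye(n), axis=0)            # (n-1) x n forward differences
A = lam * np.eye(n) + beta * D.T @ D      # system matrix of the u-subproblem

u = b.copy()
t = D @ u
rho = np.zeros(n - 1)

def shrink_stub(v, tau):
    # soft-thresholding: a hypothetical placeholder for the shrinkage of Prop. 4.1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

for k in range(200):
    # u-subproblem: argmin_u  lam/2 ||u-b||^2 + beta/2 ||t - D u||^2 - <rho, t - D u>
    u = np.linalg.solve(A, lam * b + beta * D.T @ t - D.T @ rho)
    # t-subproblem: argmin_t  R(t) + beta/2 ||t - D u||^2 - <rho, t>
    t = shrink_stub(D @ u + rho / beta, 1.0 / beta)
    # multiplier update, cf. (7.44)
    rho = rho - beta * (t - D @ u)

print("constraint violation ||t - Du||:", np.linalg.norm(t - D @ u))
```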

Proof of Theorem 5.8

Let us define the following errors:

$$\begin{aligned} \bar{u}^{(k)} = u^{(k)} - u^*, \quad \bar{t}^{(k)} = t^{(k)} - t^*, \quad \bar{\rho }^{(k)} = \rho ^{(k)} - \rho ^*. \end{aligned}$$
(7.43)

Since \((u^*,t^*;\rho ^*)\) is a saddle-point of the augmented Lagrangian functional in (4.6), it follows from Theorem 5.7 that \(t^* = D u^*\). This relationship, together with the ADMM updating formula for the vector of Lagrange multipliers in (4.10), yields:

$$\begin{aligned} \bar{\rho }^{(k+1)} = \bar{\rho }^{(k)} - \beta \, \big ( \, \bar{t}^{(k)} - D \bar{u}^{(k)} \big ). \end{aligned}$$
(7.44)

It then follows easily from (7.44) that

$$\begin{aligned} \big \Vert \bar{\rho }^{(k)} \big \Vert _2^2 - \big \Vert \bar{\rho }^{(k+1)} \big \Vert _2^2 = 2\beta \big \langle \bar{\rho }^{(k)}, \, \bar{t}^{(k)} - D \bar{u}^{(k)} \big \rangle \;{-}\; \beta ^2 \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2. \end{aligned}$$
(7.45)

Computation of a lower bound for the right-hand side of (7.45)

Since \((u^*,t^*;\rho ^*)\) is a saddle-point of the augmented Lagrangian functional in (4.6), it satisfies the following optimality conditions [see (7.36)–(7.37) in the proof of Theorem 5.7]:

$$\begin{aligned}&F(u) \;{-}\; \frac{\beta _1}{2} \, \Vert t^* - D u \Vert _2^2 \;{-}\; F(u^*) \,{+}\, \frac{\beta _1}{2} \, \Vert t^* - D u^* \Vert _2^2 \nonumber \\&\quad {-}\; \Big \langle D^T \Big ( (\beta +\beta _1) \, \big ( t^* - D u^* \big ) - \rho ^* \Big ) , \, u - u^* \Big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, u \;{\in }\; {\mathbb R}^n, \end{aligned}$$
(7.46)
$$\begin{aligned}&R(t) + \frac{\beta _2}{2} \, \Vert t - D u^* \Vert _2^2 \;{-}\; R(t^*) \;{-}\; \frac{\beta _2}{2} \, \Vert t^* - D u^* \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (t^* -D u^*) - \rho ^*, \, t - t^* \,\, \big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; {\mathbb R}^{2n}. \end{aligned}$$
(7.47)

Similarly, by the construction of \(\big (u^{(k)},t^{(k)}\big )\) in Algorithm 1, we have:

$$\begin{aligned}&F(u) \;{-}\; \frac{\beta _1}{2} \, \Vert t^{(k-1)} - D u \Vert _2^2 \,{-}\, F(u^{(k)}) \,{+}\, \frac{\beta _1}{2} \, \Vert t^{(k-1)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad {-}\; \Big \langle D^T \Big ( (\beta +\beta _1) \, \big ( t^{(k-1)} - D u^{(k)} \big ) - \rho ^{(k)} \Big ) , \, u - u^{(k)} \Big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, u \;{\in }\; {\mathbb R}^n \, ,\qquad \qquad \end{aligned}$$
(7.48)
$$\begin{aligned}&R(t) + \frac{\beta _2}{2} \, \Vert t - D u^{(k)} \Vert _2^2 \;{-}\; R(t^{(k)}) \;{-}\; \frac{\beta _2}{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (t^{(k)} - D u^{(k)}) - \rho ^{(k)}, \, t - t^{(k)} \,\, \big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; {\mathbb R}^{2n}. \end{aligned}$$
(7.49)

Taking \(u = u^{(k)}\) in (7.46), \(u = u^*\) in (7.48) and recalling that \(\langle D^T w , z \rangle = \langle w , D z \rangle \) , by addition we obtain:

$$\begin{aligned} \underbrace{-\,\big \langle \, \bar{\rho }^{(k)} , D \bar{u}^{(k)} \, \big \rangle }_{A_1} + \underbrace{\beta \, \big \langle \, \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle }_{B_1} \;{-}\; \underbrace{(\beta +\beta _1) \, \big \Vert D \bar{u}^{(k)} \big \Vert _2^2}_{C_1} \,\;{\ge }\;\, 0. \end{aligned}$$
(7.50)

Similarly, taking \(t = t^{(k)}\) in (7.47) and \(t = t^*\) in (7.49), after addition we have:

$$\begin{aligned} \underbrace{\big \langle \, \bar{\rho }^{(k)} , \bar{t}^{(k)} \, \big \rangle }_{A_2} + \underbrace{\beta \, \big \langle \, \bar{t}^{(k)} , D \bar{u}^{(k)} \, \big \rangle }_{B_2} \;{-}\; \underbrace{(\beta -\beta _2) \, \big \Vert \bar{t}^{(k)} \big \Vert _2^2}_{C_2} \,\;{\ge }\;\, 0, \end{aligned}$$
(7.51)

where, we recall, the parameters \(\beta _1\) and \(\beta _2\) in (7.50)–(7.51) satisfy the constraints in (7.35).

By summing up (7.50) and (7.51), we obtain:

$$\begin{aligned}&\big \langle \, \bar{\rho }^{(k)} , \bar{t}^{(k)} - D \bar{u}^{(k)} \, \big \rangle - \beta \, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle \nonumber \\&\quad - \left( (\beta -\beta _2) \, \big \Vert \bar{t}^{(k)} \big \Vert _2^2 - 2 \beta \, \big \langle \, \bar{t}^{(k)} , D \bar{u}^{(k)} \, \big \rangle + (\beta +\beta _1) \, \big \Vert D \bar{u}^{(k)} \big \Vert _2^2 \right) \;{\ge }\; 0 \nonumber \end{aligned}$$

that is

$$\begin{aligned}&\big \langle \, \bar{\rho }^{(k)} , \bar{t}^{(k)} - D \bar{u}^{(k)} \, \big \rangle - \beta \, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle - \frac{\beta +\beta _3}{2} \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 \nonumber \\&\quad {-}\; \left( \left( -\,\beta _2-\frac{\beta _3}{2}+\frac{\beta }{2}\right) \, \big \Vert \bar{t}^{(k)} \big \Vert _2^2 - (\beta -\beta _3) \, \big \langle \, \bar{t}^{(k)} , D \bar{u}^{(k)} \, \big \rangle \right. \nonumber \\&\quad \left. + \left( \beta _1-\frac{\beta _3}{2}+\frac{\beta }{2}\right) \, \big \Vert D \bar{u}^{(k)} \big \Vert _2^2 \right) \;{\ge }\; 0, \end{aligned}$$
(7.52)

where we introduced the positive coefficient \(\beta _3 > 0\) (the reason will be clear later on). We want the last term in (7.52) to take the form \(-\,\big \Vert \, c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2^2 \) with \(c_1,c_2 > 0\). Hence, first we impose that the coefficients of \(\big \Vert \bar{t}^{(k)} \big \Vert _2^2\) and \(\big \Vert D \bar{u}^{(k)} \big \Vert _2^2\) in (7.52) are strictly positive, which yields:

$$\begin{aligned} \beta _1 > \frac{\beta _3}{2} - \frac{\beta }{2}, \qquad \quad \beta _2 < -\frac{\beta _3}{2} + \frac{\beta }{2}. \end{aligned}$$
(7.53)

Combining (7.53) with conditions (7.35), we obtain:

$$\begin{aligned} \frac{\beta _3}{2} - \frac{\beta }{2}< \beta _1 \le \tau _c \, \frac{9}{8}a, \qquad \, a \le \beta _2< -\frac{\beta _3}{2} + \frac{\beta }{2}, \qquad \, 0< \beta _3 < \beta - 2a. \end{aligned}$$
(7.54)

From the condition on \(\beta _3\) in (7.54), the following constraint on \(\beta \) is derived:

$$\begin{aligned} \beta > 2a. \end{aligned}$$
(7.55)

We notice that condition (7.55) can be more stringent than (5.17), depending on \(\tau _c\), hence it has been taken as a hypothesis of this theorem and will be considered, together with (5.17), in the rest of the proof. From the condition on \(\beta _3\) in (7.54) it also follows that the coefficient \(\beta - \beta _3\) of the scalar product in (7.52) is positive.

Then, we have to impose that the coefficient of the term \(-\big \langle \, \bar{t}^{(k)} , D \bar{u}^{(k)} \, \big \rangle \) in (7.52) is twice the product of the square roots of the (positive) coefficients of \(\big \Vert \bar{t}^{(k)} \big \Vert _2^2\) and \(\big \Vert D \bar{u}^{(k)} \big \Vert _2^2\), that is:

$$\begin{aligned} \beta - \beta _3 = 2 \sqrt{ \left( -\,\beta _2-\frac{\beta _3}{2}+\frac{\beta }{2}\right) \left( \beta _1-\frac{\beta _3}{2}+\frac{\beta }{2}\right) } \;{\Longrightarrow }\; \beta = \beta _3 + 2 \frac{\beta _1 \beta _2}{\beta _1-\beta _2} .\qquad \end{aligned}$$
(7.56)

By imposing the condition on \(\beta _3\) in (7.54), namely \(\beta -\beta _3 > 2a\), it is easy to verify that (7.56) admits acceptable solutions only if \(\beta _1 > \beta _2\). By setting in (7.56) \(\beta _1 = \tau _c \frac{9}{8} \, a\) and \(\beta _2 = a\), which are acceptable values according to this last result (since \(\tau _c > 1\), clearly \(\beta _1 > \beta _2\)) and also to conditions (7.54), we obtain:

$$\begin{aligned} \beta = \beta _3 + 2 a \, \frac{9 \, \tau _c}{9\,\tau _c - 8}. \end{aligned}$$
(7.57)

We now check if there exist acceptable values for the two remaining free parameters, namely \(\beta \) and \(\beta _3\), such that (7.57) holds. We impose that \(\beta \) in (7.57) satisfies its constraint in (5.17), which guarantees convexity of the augmented Lagrangian functional, and the derived condition in (7.55):

$$\begin{aligned} \left\{ \begin{array}{lll} \displaystyle {\beta _3 + 2 a \, \frac{9 \, \tau _c}{9\,\tau _c - 8}} &{}{\ge }&{} \displaystyle {a \frac{9 \, \tau _c}{9\,\tau _c - 8}} \\ \displaystyle {\beta _3 + 2 a \, \frac{9 \, \tau _c}{9\,\tau _c - 8}} &{}{>}&{} 2 a \end{array} \right. \,\;{\Longrightarrow }\;\, \left\{ \begin{array}{lll} \displaystyle {\beta _3} &{}{\ge }&{} \displaystyle {- a \, \frac{9 \, \tau _c}{9\,\tau _c - 8}} \\ \displaystyle {\beta _3} &{}{>}&{} \displaystyle {- a \, \frac{16}{9\,\tau _c - 8}} \end{array} \right. \end{aligned}$$
(7.58)

Since \(\tau _c > 1\) (and \(a > 0\)), both conditions in (7.58) are satisfied for any \(\beta _3 > 0\). Hence, for \(\beta _1 = \tau _c \frac{9}{8} \, a\), \(\,\beta _2 = a\) and any \(0< \beta _3 < \beta - 2a\), with \(\beta > 2a\), the last term in (7.52) can be written in the form

$$\begin{aligned} -\,\big \Vert \, c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2^2 \;\;\;\; \mathrm {with}\;\;\;\, \left\{ \begin{array}{lll} c_1 &{}{=}&{} \frac{\beta -\beta _3}{2} - a \\ c_2 &{}{=}&{} \frac{\beta -\beta _3}{2} + \tau _c \frac{9}{8} a \end{array} \right. \end{aligned}$$
(7.59)

where \(c_1,c_2 > 0\), \(c_1 \ne c_2\). Replacing the expression in (7.59) for the last term in (7.52), we have:

$$\begin{aligned}&\big \langle \, \bar{\rho }^{(k)} , \bar{t}^{(k)} - D \bar{u}^{(k)} \, \big \rangle - \frac{\beta +\beta _3}{2} \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 - \beta \, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle \nonumber \\&\qquad {-}\; \big \Vert \, c_1 \, \bar{t}^{(k)} - c_2 \, D \bar{u}^{(k)} \big \Vert _2^2 \,{\ge }\;\, 0 \nonumber \\&\quad {\Longleftrightarrow }\, 2 \beta \, \big \langle \, \bar{\rho }^{(k)} , \bar{t}^{(k)} - D \bar{u}^{(k)} \, \big \rangle - \beta ^2 \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 \,{\ge }\;\, \beta \beta _3 \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 \nonumber \\&\qquad {+}\; 2 \beta ^2 \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle + 2 \beta \, \big \Vert \, c_1 \, \bar{t}^{(k)} - c_2 \, D \bar{u}^{(k)} \big \Vert _2^2, \end{aligned}$$
(7.60)

where in (7.60) we multiplied both sides by the positive coefficient \(2\beta \). We notice that the left-hand side of (7.60) coincides with the right-hand side of (7.45), hence it follows that:

$$\begin{aligned}&\big \Vert \bar{\rho }^{(k)} \big \Vert _2^2 - \big \Vert \bar{\rho }^{(k+1)} \big \Vert _2^2 \;{\ge }\;\, \beta \beta _3 \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 +\, 2 \beta ^2 \underbrace{\big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , D \bar{u}^{(k)} \, \big \rangle }_{T} \nonumber \\&\qquad {+}\;2 \beta \, \big \Vert \, c_1 \, \bar{t}^{(k)} - c_2 \, D \bar{u}^{(k)} \big \Vert _2^2. \end{aligned}$$
(7.61)

Computation of a lower bound for the term \(\varvec{T}\) in (7.61).

We can write:

$$\begin{aligned} \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k)} \, \big \rangle= & {} \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k)} - D \bar{u}^{(k-1)} \, \big \rangle \nonumber \\&+\, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k-1)} - \bar{t}^{(k-1)} \, \big \rangle \nonumber \\&+\, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, \bar{t}^{(k-1)} \, \big \rangle . \end{aligned}$$
(7.62)

First, we notice that:

$$\begin{aligned} \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, \bar{t}^{(k-1)} \, \big \rangle = \frac{1}{2} \, \Big ( \, \big \Vert \bar{t}^{(k)}\big \Vert _2^2 - \big \Vert \bar{t}^{(k-1)}\big \Vert _2^2 - \big \Vert \bar{t}^{(k)}-\bar{t}^{(k-1)}\big \Vert _2^2 \, \Big ). \end{aligned}$$
(7.63)

Then, from the construction of \(t^{(k-1)}\) (from \(u^{(k-1)}\)), we have:

$$\begin{aligned}&R(t) + \frac{\beta _2}{2} \, \Vert t - D u^{(k-1)} \Vert _2^2 \;{-}\; R(t^{(k-1)}) \;{-}\; \frac{\beta _2}{2} \, \Vert t^{(k-1)} - D u^{(k-1)} \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (t^{(k-1)} - D u^{(k-1)}) - \rho ^{(k-1)}, \, t - t^{(k-1)} \,\, \big \rangle \,\;{\ge }\;\, 0 \quad \; \forall \, t \;{\in }\; {\mathbb R}^{2n}.\qquad \qquad \end{aligned}$$
(7.64)

Taking \(t = t^{(k-1)}\) in (7.49) and \(t = t^{(k)}\) in (7.64), we obtain:

$$\begin{aligned}&R(t^{(k-1)}) + \frac{\beta _2}{2} \, \Vert t^{(k-1)} - D u^{(k)} \Vert _2^2 \;{-}\; R(t^{(k)}) \;{-}\; \frac{\beta _2}{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (t^{(k)} - D u^{(k)}) - \rho ^{(k)}, \, t^{(k-1)} - t^{(k)} \,\, \big \rangle \,\;{\ge }\;\, 0, \end{aligned}$$
(7.65)
$$\begin{aligned}&R(t^{(k)}) + \frac{\beta _2}{2} \, \Vert t^{(k)} - D u^{(k-1)} \Vert _2^2 \;{-}\; R(t^{(k-1)}) \;{-}\; \frac{\beta _2}{2} \, \Vert t^{(k-1)} - D u^{(k-1)} \Vert _2^2 \nonumber \\&\quad {+}\, \big \langle \, (\beta -\beta _2) (t^{(k-1)} - D u^{(k-1)}) - \rho ^{(k-1)}, \, t^{(k)} - t^{(k-1)} \,\, \big \rangle \,\;{\ge }\;\, 0. \end{aligned}$$
(7.66)

By addition of (7.65) and (7.66), we have that

$$\begin{aligned}&\beta \, \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k)} - D \bar{u}^{(k-1)} \, \big \rangle + \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, \bar{\rho }^{(k)} - \bar{\rho }^{(k-1)} \, \big \rangle \nonumber \\&\quad \ge (\beta -\beta _2) \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2^2. \end{aligned}$$
(7.67)

Recalling that

$$\begin{aligned} \bar{\rho }^{(k)} - \bar{\rho }^{(k-1)} = \rho ^{(k)} - \rho ^{(k-1)} = - \beta \big ( \bar{t}^{(k-1)} - D \bar{u}^{(k-1)} \big ), \end{aligned}$$
(7.68)

replacing (7.68) into (7.67) and then dividing by \(\beta \), we obtain:

$$\begin{aligned}&\big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k)} - D \bar{u}^{(k-1)} \, \big \rangle + \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k-1)} - \bar{t}^{(k-1)} \, \big \rangle \nonumber \\&\quad \ge \frac{\beta -\beta _2}{\beta } \, \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2^2. \end{aligned}$$
(7.69)

From (7.62), (7.63) and (7.69), we have:

$$\begin{aligned} \big \langle \, \bar{t}^{(k)} - \bar{t}^{(k-1)} , \, D \bar{u}^{(k)} \, \big \rangle\ge & {} \frac{1}{2} \, \Big ( \, \big \Vert \bar{t}^{(k)}\big \Vert _2^2 - \big \Vert \bar{t}^{(k-1)}\big \Vert _2^2 - \big \Vert \bar{t}^{(k)}-\bar{t}^{(k-1)}\big \Vert _2^2 \, \Big ) \nonumber \\&+ \frac{\beta -\beta _2}{\beta } \, \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2^2 \nonumber \\= & {} \frac{1}{2} \, \Big ( \, \big \Vert \bar{t}^{(k)}\big \Vert _2^2 - \big \Vert \bar{t}^{(k-1)}\big \Vert _2^2 + \left( \frac{\beta -2\beta _2}{\beta }\right) \big \Vert \bar{t}^{(k)}-\bar{t}^{(k-1)}\big \Vert _2^2 \, \Big ).\nonumber \\ \end{aligned}$$
(7.70)

Convergence results for sequences \(\varvec{t^{(k)},Du^{(k)},\rho ^{(k)}}\).

From (7.61) and (7.70), we obtain:

$$\begin{aligned} \big \Vert \bar{\rho }^{(k)} \big \Vert _2^2 - \big \Vert \bar{\rho }^{(k+1)} \big \Vert _2^2\ge & {} \beta ^2 \, \big \Vert \bar{t}^{(k)}\big \Vert _2^2 - \beta ^2 \, \big \Vert \bar{t}^{(k-1)}\big \Vert _2^2 + \beta (\beta -2\beta _2) \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2^2 \nonumber \\&{+}\; \beta \beta _3 \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 + 2 \beta \, \big \Vert \, c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2^2 ,\qquad \end{aligned}$$
(7.71)

that is:

$$\begin{aligned}&\underbrace{ \left( \big \Vert \bar{\rho }^{(k)} \big \Vert _2^2 + \beta ^2 \big \Vert \bar{t}^{(k-1)}\big \Vert _2^2 \right) }_{s^{(k)}} - \underbrace{ \left( \big \Vert \bar{\rho }^{(k+1)} \big \Vert _2^2 + \beta ^2 \big \Vert \bar{t}^{(k)}\big \Vert _2^2 \right) }_{s^{(k+1)}} \nonumber \\&\quad \ge \beta (\beta -2\beta _2) \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2^2 +\, \beta \beta _3 \, \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2^2 \nonumber \\&\qquad +\, 2 \beta \, \big \Vert \, c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2^2 \;{\ge }\;\, 0, \end{aligned}$$
(7.72)

where we have introduced the scalar sequence \(\{s^{(k)}\}\), which is clearly bounded from below by zero. We notice that the coefficient \(\beta -2\beta _2\) in (7.72) is positive due to the constraint \(\beta > 2a\). Since the right-hand side of the first inequality in (7.72) is nonnegative, \(\{s^{(k)}\}\) is monotonically non-increasing, hence convergent. This implies that the right-hand side of (7.72) tends to zero as \(k \rightarrow \infty \). From these considerations and (7.72) it follows that:

$$\begin{aligned}&\big \{\bar{\rho }^{(k)}\big \} , \, \big \{\bar{t}^{(k)}\big \} , \, \big \{D \bar{u}^{(k)}\big \} \,\;\mathrm {are}\;\, \mathrm {bounded} \;{\Longrightarrow }\; \big \{\rho ^{(k)}\big \} , \, \big \{t^{(k)}\big \}, \, \big \{D u^{(k)}\big \} \;\mathrm {bounded} \,,\nonumber \\ \end{aligned}$$
(7.73)
$$\begin{aligned}&\lim _{k \rightarrow \infty } \big \Vert \bar{t}^{(k)} - \bar{t}^{(k-1)} \big \Vert _2 = \lim _{k \rightarrow \infty } \big \Vert t^{(k)} - t^{(k-1)} \big \Vert _2 = 0, \end{aligned}$$
(7.74)
$$\begin{aligned}&\lim _{k \rightarrow \infty } \big \Vert \bar{t}^{(k)} - D \bar{u}^{(k)} \big \Vert _2 = \lim _{k \rightarrow \infty } \big \Vert t^{(k)} - D u^{(k)} \big \Vert _2 = 0, \end{aligned}$$
(7.75)
$$\begin{aligned}&\lim _{k \rightarrow \infty } \big \Vert c_1 \bar{t}^{(k)} - c_2 D \bar{u}^{(k)} \big \Vert _2 = 0 . \end{aligned}$$
(7.76)

Since the two coefficients \(c_1\), \(c_2\) in (7.76) satisfy \(c_1, c_2 \ne 0\), \(c_1 \ne c_2\), it follows from (7.75)–(7.76) that both the sequences \(\{\bar{t}^{(k)}\}\) and \(\{D \bar{u}^{(k)}\}\) tend to zero as \(k \rightarrow \infty \). Results in (7.73)–(7.76) can thus be rewritten in the following more concise and informative form:

$$\begin{aligned}&\big \{\rho ^{(k)}\big \} \,\;\mathrm {is}\;\, \mathrm {bounded}, \end{aligned}$$
(7.77)
$$\begin{aligned}&\lim _{k \rightarrow \infty } \bar{t}^{(k)} = 0 \,\;\;{\Longleftrightarrow }\;\; \lim _{k \rightarrow \infty } t^{(k)} = t^* = D u^*, \end{aligned}$$
(7.78)
$$\begin{aligned}&\lim _{k \rightarrow \infty } D \bar{u}^{(k)} = 0 \,\;\;{\Longleftrightarrow }\;\; \lim _{k \rightarrow \infty } D u^{(k)} = D u^* , \end{aligned}$$
(7.79)

where the last equality in (7.78) comes from the saddle-point properties stated in Theorem 5.7. Since it will be useful later on, we note that it follows from (7.78) that

$$\begin{aligned} \lim _{k \rightarrow \infty } R(t^{(k)}) \,\,{=}\;\, R(t^*). \end{aligned}$$
(7.80)

Convergence results for sequence \(\varvec{u^{(k)}}\).

We now prove that \(\lim _{{k \rightarrow \infty } } u^{(k)} =u^*\). Since \((u^*,t^*;\rho ^*)\) is a saddle point of the augmented Lagrangian functional \(\mathcal {L}(u,t;\rho )\), we have

$$\begin{aligned} \mathcal {L}(u^*,t^*;\rho ^*) \;{\le }\;\, \mathcal {L}(u,t;\rho ^*) \quad \forall \, (u,t) \;{\in }\;\, {\mathbb R}^n {\times }\; {\mathbb R}^{2n}. \end{aligned}$$
(7.81)

By taking \(u = u^{(k)}\), \(t = t^{(k)}\) in (7.81) and recalling the definition of \(\mathcal {L}(u,t;\rho )\) in (5.2), we have:

$$\begin{aligned}&F(u^*) + R(t^*) \;{-}\; \langle \, \rho ^* , \underbrace{t^* - D u^*}_{0} \, \rangle + \frac{\beta }{2} \, \Vert \underbrace{t^* - D u^*}_{0} \Vert _2^2 \nonumber \\&\quad \le F(u^{(k)}) + R(t^{(k)}) \;{-}\; \langle \, \rho ^* , t^{(k)} - D u^{(k)} \, \rangle + \frac{\beta }{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad {\Longleftrightarrow }\;\; F(u^*) \le F(u^{(k)}) + R(t^{(k)}) \;{-}\; R(t^*) \nonumber \\&\qquad {-}\; \langle \, \rho ^* , t^{(k)} - D u^{(k)} \, \rangle + \frac{\beta }{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \, . \end{aligned}$$
(7.82)

Taking \(u = u^*\) in (7.48) and \(t = t^*\) in (7.49), we obtain:

$$\begin{aligned}&F(u^*) \;{-}\; \frac{\beta _1}{2} \, \Vert t^{(k-1)} - D u^* \Vert _2^2 \,{-}\, F(u^{(k)}) \,{+}\, \frac{\beta _1}{2} \, \Vert t^{(k-1)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad -\; \Big \langle D^T \Big ( (\beta +\beta _1) \, \big ( t^{(k-1)} - D u^{(k)} \big ) - \rho ^{(k)} \Big ) , \, u^* - u^{(k)} \Big \rangle \,\;{\ge }\;\, 0, \end{aligned}$$
(7.83)
$$\begin{aligned}&R(t^*) + \frac{\beta _2}{2} \, \Vert t^* - D u^{(k)} \Vert _2^2 \;{-}\; R(t^{(k)}) \;{-}\; \frac{\beta _2}{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad +\, \big \langle \, (\beta -\beta _2) (t^{(k)} - D u^{(k)}) - \rho ^{(k)}, \, t^* - t^{(k)} \,\, \big \rangle \,\;{\ge }\;\, 0. \end{aligned}$$
(7.84)

By summing up (7.83) and (7.84), we have:

$$\begin{aligned}&F(u^*) \,\;{\ge }\;\, F(u^{(k)}) + R(t^{(k)}) \;{-}\; R(t^*) + \frac{\beta _1}{2} \, \Vert D u^* \Vert _2^2 \;{-}\; \frac{\beta _1}{2} \, \Vert D u^{(k)} \Vert _2^2 \nonumber \\&\quad {-}\; \beta _1 \, \big \langle t^{(k-1)}, \, Du^* - Du^{(k)} \big \rangle \;{-}\; \frac{\beta _2}{2} \, \Vert t^* - D u^{(k)} \Vert _2^2 + \frac{\beta _2}{2} \, \Vert t^{(k)} - D u^{(k)} \Vert _2^2 \nonumber \\&\quad {+}\; \big \langle (\beta +\beta _1) \, \big ( t^{(k-1)} - D u^{(k)} \big ) - \rho ^{(k)} , \, Du^* - Du^{(k)} \big \rangle \nonumber \\&\quad {-}\, \big \langle \, (\beta -\beta _2) (t^{(k)} - D u^{(k)}) - \rho ^{(k)}, \, t^* - t^{(k)} \,\, \big \rangle . \end{aligned}$$
(7.85)

Taking \(\lim \inf \) of (7.82) and \(\lim \sup \) of (7.85), and using the results in (7.77)–(7.80), we have

$$\begin{aligned} \lim \inf \, F(u^{(k)}) \,\;{\ge }\;\,\, F(u^*) \,\;{\ge }\;\,\, \lim \sup \, F(u^{(k)}). \end{aligned}$$
(7.86)

It follows from (7.86) that

$$\begin{aligned} \lim _{k \rightarrow \infty } F(u^{(k)}) = F(u^*). \end{aligned}$$
(7.87)

We now manipulate \(F(u^{(k)})\) as follows:

$$\begin{aligned} F(u^{(k)})= & {} \frac{\lambda }{2} \, \Vert u^{(k)} - b \Vert _2^2 \,{=}\; \frac{\lambda }{2} \, \langle \, u^{(k)} - b, \, u^{(k)} - b \, \rangle \nonumber \\= & {} \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}+u^*}{2} - b, \, u^{(k)} - b \, \right\rangle + \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}-u^*}{2}, \, u^{(k)} - b \, \right\rangle \nonumber \\= & {} \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}+u^*}{2} - b, \, \frac{u^{(k)}+u^*}{2} - b \, \right\rangle + \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}+u^*}{2} - b, \, \frac{u^{(k)}-u^*}{2} \, \right\rangle \nonumber \\&{+}\; \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}-u^*}{2}, \, u^{(k)} - b \, \right\rangle \nonumber \\= & {} \frac{\lambda }{2} \, \left\| \frac{u^{(k)}+u^*}{2} - b \right\| _2^2 + \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}-u^*}{2}, \, \frac{u^{(k)}+u^*}{2} - b + u^{(k)} - b \, \right\rangle \nonumber \\= & {} \frac{\lambda }{2} \, \left\| \frac{u^{(k)}+u^*}{2} - b \right\| _2^2 + \frac{\lambda }{2} \, \left\langle \, \frac{u^{(k)}-u^*}{2}, \, \frac{u^{(k)}-u^*}{2} + u^{(k)}+u^* - 2b \, \right\rangle \nonumber \\= & {} \frac{\lambda }{2} \, \left\| \frac{u^{(k)}+u^*}{2} - b \right\| _2^2 + \frac{\lambda }{2} \, \left\| \frac{u^{(k)}-u^*}{2} \right\| _2^2 \nonumber \\&\quad + \lambda \left\langle \, \frac{u^{(k)}-u^*}{2}, \, \frac{u^{(k)}+u^*}{2} - b \, \right\rangle \nonumber \\\ge & {} \frac{\lambda }{2} \, \left\| \frac{u^{(k)}+u^*}{2} - b \right\| _2^2 + \lambda \left\langle \, \frac{u^{(k)}-u^*}{2}, \, \frac{u^{(k)}+u^*}{2} - b \, \right\rangle . \end{aligned}$$
(7.88)

On the other hand, we have that

$$\begin{aligned} \big \langle \, \rho ^*, \, D u^{(k)} - D u^* \, \big \rangle= & {} \big \langle \, \rho ^*, \, D (u^{(k)} - u^*) \, \big \rangle = \big \langle \, D^T \rho ^*, \, u^{(k)} - u^* \, \big \rangle \nonumber \\= & {} \lambda \, \big \langle \, u^* - b, \, u^* - u^{(k)} \, \big \rangle , \end{aligned}$$
(7.89)

where in (7.89) we have used the (optimality) condition (7.30). From (7.88) and (7.89) it follows that

$$\begin{aligned}&F(u^{(k)}) \,+\, \big \langle \, \rho ^*, \, D u^{(k)} - D u^* \, \big \rangle \nonumber \\&\quad \ge \frac{\lambda }{2} \, \left\| \frac{u^{(k)}+u^*}{2} - b \right\| _2^2 + \lambda \left\langle \, \frac{u^{(k)}-u^*}{2}, \, \frac{u^{(k)}+u^*}{2} - b \, \right\rangle \nonumber \\&\qquad +\; \lambda \, \big \langle \, u^* - b, \, u^* - u^{(k)} \, \big \rangle \nonumber \\&\quad = \underbrace{\frac{\lambda }{2} \, \Vert u^* - b \Vert _2^2}_{F(u^*)} \,+\, \frac{3}{8} \, \lambda \, \Vert u^{(k)} - u^* \Vert _2^2, \end{aligned}$$
(7.90)

that is

$$\begin{aligned} F(u^{(k)}) \;{-}\; F(u^*) \,+\, \big \langle \, \rho ^*, \, D u^{(k)} - D u^* \, \big \rangle \,\;{\ge }\;\, \frac{3}{8} \, \lambda \, \Vert u^{(k)} - u^* \Vert _2^2. \end{aligned}$$
(7.91)

Taking the limit for \(k \rightarrow \infty \) of both sides of (7.91) and recalling (7.79) and (7.87), we obtain:

$$\begin{aligned} 0 \,\;{\ge }\;\, \lim _{k \rightarrow \infty } \frac{3}{8} \, \lambda \, \Vert u^{(k)} - u^* \Vert _2^2 \;\;\;{\Longrightarrow }\;\; \lim _{k \rightarrow \infty } u^{(k)} = u^*, \end{aligned}$$
(7.92)

thus completing the proof. \(\square \)
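The parameter relations used in the proof, namely (7.56)–(7.58), can be verified with a few lines of arithmetic. A minimal sketch (the values a = 1, τ_c = 1.5, β₃ = 0.25 are hypothetical; only the identities and inequalities stated above are checked):

```python
import math

# Hypothetical values: a > 0, tau_c > 1, beta_3 > 0.
a, tau_c, beta_3 = 1.0, 1.5, 0.25

beta_1 = tau_c * 9.0 / 8.0 * a                                  # choice made after (7.56)
beta_2 = a
beta   = beta_3 + 2.0 * a * 9.0 * tau_c / (9.0 * tau_c - 8.0)   # (7.57)

# identity (7.56): beta - beta_3 = 2 sqrt((-b2 - b3/2 + b/2)(b1 - b3/2 + b/2))
lhs = beta - beta_3
rhs = 2.0 * math.sqrt((-beta_2 - beta_3 / 2 + beta / 2) * (beta_1 - beta_3 / 2 + beta / 2))
print(abs(lhs - rhs) < 1e-12)                      # True

print(beta > 2 * a)                                # (7.55)
print(0 < beta_3 < beta - 2 * a)                   # condition on beta_3 in (7.54)
print(beta >= a * 9 * tau_c / (9 * tau_c - 8))     # first condition in (7.58), i.e. (5.17)
```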


About this article


Cite this article

Chan, R., Lanza, A., Morigi, S. et al. Convex non-convex image segmentation. Numer. Math. 138, 635–680 (2018). https://doi.org/10.1007/s00211-017-0916-4
