Abstract
In a Hilbert space \({\mathcal {H}}\), given a maximally monotone operator \(A{:}\;{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\), we study the convergence properties of a general class of relaxed inertial proximal algorithms. This study aims to extend to the general monotone inclusion \(Ax \ni 0\) the acceleration techniques initially introduced by Nesterov for convex minimization. The relaxed form of the proximal algorithms plays a central role. It arises naturally from the regularization of the operator A by its Yosida approximation with a variable parameter, a technique recently introduced by Attouch–Peypouquet (Math Program Ser B, 2018. https://doi.org/10.1007/s10107-018-1252-x) for a particular class of inertial proximal algorithms. Our study provides an algorithmic version of the convergence results obtained by Attouch–Cabot (J Differ Equ 264:7138–7182, 2018) for continuous dynamical systems.

Notes
Note that in [4, Proposition 14], a closely related but different condition has been considered: the difference of the quotients is assumed to be less than or equal to c (and this guarantees (\(K_0\))).
References
Álvarez, F.: Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM J. Optim. 14, 773–782 (2004)
Álvarez, F., Attouch, H.: The heavy ball with friction dynamical system for convex constrained minimization problems, Optimization (Namur, 1998), pp. 25–35, Lecture Notes in Economics and Mathematical Systems, vol. 481. Springer, Berlin, (2000)
Álvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set Valued Anal. 9(1–2), 3–11 (2001)
Attouch, H., Cabot, A.: Convergence rates of inertial forward–backward algorithms. SIAM J. Optim. 28, 849–874 (2018)
Attouch, H., Cabot, A.: Convergence of damped inertial dynamics governed by regularized maximally monotone operators. J. Differ. Equ. 264, 7138–7182 (2018)
Attouch, H., Maingé, P.E.: Asymptotic behavior of second-order dissipative evolution equations combining potential with non-potential effects. ESAIM Control Optim. Calc. Var. 17, 836–857 (2010)
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximal monotone operators. Math. Program. Ser. B. https://doi.org/10.1007/s10107-018-1252-x (2018)
Bauschke, H., Combettes, P.: Convex Analysis and Monotone Operator Theory in Hilbert spaces. CMS Books in Mathematics. Springer, Berlin (2011)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
Bot, R.I., Csetnek, E.R.: Second order forward–backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54, 1423–1443 (2016)
Boţ, R.I., Csetnek, E.R., Hendrich, C.: Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl. Math. Comput. 256, 472–487 (2015)
Brézis, H.: Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution. Lecture Notes 5. North Holland (1972)
Brézis, H., Browder, F.E.: Nonlinear ergodic theorems. Bull. Am. Math. Soc. 82(6), 959–961 (1976)
Brézis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29, 329–345 (1978)
Cabot, A., Frankel, P.: Asymptotics for some proximal-like method involving inertia and memory aspects. Set Valued Var. Anal. 19, 59–74 (2011)
Combettes, P.L., Glaudin, L.E.: Quasinonexpansive iterations on the affine hull of orbits: from Mann’s mean value algorithm to inertial methods. SIAM J. Optim. 27, 2356–2380 (2017)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
Eckstein, J., Ferris, M.C.: Operator-splitting methods for monotone affine variational inequalities, with a parallel application to optimal control. Informs J. Comput. 10, 218–235 (1998)
Iutzeler, F., Hendrickx, J.M.: Generic online acceleration scheme for optimization algorithms via relaxation and inertia. Optim. Methods Softw. (2018) (to appear)
Lorenz, D.A., Pock, T.: An inertial forward–backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)
Maingé, P.-E.: Convergence theorems for inertial KM-type algorithms. J. Comput. Appl. Math. 219, 223–236 (2008)
Moudafi, A., Oliny, M.: Convergence of a splitting inertial proximal method for monotone operators. J. Comput. Appl. Math. 155, 447–454 (2003)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)
Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)
Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72, 383–390 (1979)
Pesquet, J.-C., Pustelnik, N.: A parallel inertial proximal optimization method. Pac. J. Optim. 8, 273–305 (2012)
Appendices
Appendix A. Yosida regularization
A set-valued mapping A from \({\mathcal {H}}\) to \({\mathcal {H}}\) assigns to each \(x\in {\mathcal {H}}\) a set \(A(x)\subset {\mathcal {H}}\), hence it is a mapping from \({\mathcal {H}}\) to \(2^{\mathcal {H}}\). Every set-valued mapping \(A:{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\) can be identified with its graph defined by
$$\begin{aligned} {\mathrm{gph}}A=\left\{ (x,u)\in {\mathcal {H}}\times {\mathcal {H}}:\ u\in A(x)\right\} . \end{aligned}$$
The set \(\{x\in {\mathcal {H}}:\ 0\in A(x)\}\) of the zeros of A is denoted by \({\mathrm{zer}}A\). An operator \(A:{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\) is said to be monotone if for any (x, u), \((y,v)\in {\mathrm{gph}}A\), one has \(\langle y-x, v-u\rangle \ge 0\). It is maximally monotone if there exists no monotone operator whose graph strictly contains \({\mathrm{gph}}A\). If a single-valued operator \(A:{\mathcal {H}}\rightarrow {\mathcal {H}}\) is continuous and monotone, then it is maximally monotone, cf. [13, Proposition 2.4].
Given a maximally monotone operator A and \(\lambda >0\), the resolvent of A with index \(\lambda \) and the Yosida regularization of A with parameter \(\lambda \) are defined by
$$\begin{aligned} J_{\lambda A}=\left( I+\lambda A\right) ^{-1} \quad \hbox {and}\quad A_{\lambda }=\frac{1}{\lambda }\left( I-J_{\lambda A}\right) , \end{aligned}$$
respectively. The operator \(J_{\lambda A}: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is nonexpansive and everywhere defined (indeed it is firmly nonexpansive). Moreover, \(A_{\lambda }\) is \(\lambda \)-cocoercive: for all \(x, y \in {\mathcal {H}}\) we have
$$\begin{aligned} \langle A_{\lambda }x-A_{\lambda }y,\, x-y\rangle \ge \lambda \Vert A_{\lambda }x-A_{\lambda }y\Vert ^2. \end{aligned}$$
This property immediately implies that \(A_{\lambda }: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is \(\frac{1}{\lambda }\)-Lipschitz continuous. Another property that proves useful is the resolvent equation (see, for example, [13, Proposition 2.6] or [8, Proposition 23.6])
$$\begin{aligned} J_{\lambda A}x= J_{\mu A}\left( \frac{\mu }{\lambda }x+\left( 1-\frac{\mu }{\lambda }\right) J_{\lambda A}x\right) , \end{aligned}$$
which is valid for any \(\lambda , \mu >0\). This property makes it possible to compute the resolvent of \(A_\lambda \) simply by
$$\begin{aligned} J_{\mu A_{\lambda }}x=\frac{\lambda }{\lambda +\mu }\,x+\frac{\mu }{\lambda +\mu }\,J_{(\lambda +\mu ) A}x \end{aligned}$$
for any \(\lambda , \mu >0\). Also note that for any \(x \in {\mathcal {H}}\) and any \(\lambda >0\), \( A_\lambda (x) \in A (J_{\lambda A}x)= A( x - \lambda A_\lambda (x))\). Finally, for any \(\lambda >0\), \(A_{\lambda }\) and A have the same solution set, \({\mathrm{zer}}A_{\lambda }= {\mathrm{zer}}A\). For a detailed presentation of maximally monotone operators and the Yosida approximation, the reader can consult [8] or [13].
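As a concrete illustration (ours, not part of the original text), take \(A=\partial |\cdot |\) on the real line: the resolvent \(J_{\lambda A}\) is the soft-thresholding map and \(A_\lambda (x)=\mathrm{clip}(x/\lambda ,-1,1)\). The sketch below numerically checks the \(\frac{1}{\lambda }\)-Lipschitz bound and the above formula for \(J_{\mu A_\lambda }\); the helper names `soft` and `yosida` are ours.

```python
import math

def soft(x, lam):
    # Resolvent J_{lam A} = (I + lam*A)^{-1} of A = d|.|: soft-thresholding
    return math.copysign(max(abs(x) - lam, 0.0), x)

def yosida(x, lam):
    # Yosida approximation A_lam = (I - J_{lam A}) / lam
    return (x - soft(x, lam)) / lam

lam, mu = 1.0, 2.0
xs = [-5.0 + 0.01 * i for i in range(1001)]

# lam-cocoercivity implies that A_lam is (1/lam)-Lipschitz continuous
for u, v in zip(xs, xs[1:]):
    assert abs(yosida(u, lam) - yosida(v, lam)) <= (1.0 / lam) * abs(u - v) + 1e-12

# Resolvent of the Yosida approximation:
#   J_{mu A_lam} x = lam/(lam+mu) * x + mu/(lam+mu) * J_{(lam+mu)A} x.
# Check that y := right-hand side solves y + mu * A_lam(y) = x.
for x in xs:
    y = lam / (lam + mu) * x + mu / (lam + mu) * soft(x, lam + mu)
    assert abs(y + mu * yosida(y, lam) - x) < 1e-12

print("all checks passed")
```

For instance, with \(\lambda =1\), \(\mu =2\), \(x=5\): \(J_{3A}(5)=2\), so \(y=\frac{1}{3}\cdot 5+\frac{2}{3}\cdot 2=3\) and indeed \(y+2A_1(y)=3+2=5\).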
Appendix B. Some auxiliary results
In this section, we present some auxiliary lemmas that are used throughout the paper.
Lemma B.1
Let \((a_k)\), \((\alpha _k)\) and \((w_k)\) be sequences of real numbers satisfying
$$\begin{aligned} a_{i+1}\le \alpha _i\, a_i+w_i \quad \hbox {for every } i\ge 1. \end{aligned}$$(67)
Assume that \(\alpha _i\ge 0\) for every \(i\ge 1\).
(i) For every \(k\ge 1\), we have
$$\begin{aligned} \sum _{i=1}^k a_i\le t_{1,k}a_1+\sum _{i=1}^{k-1} t_{i+1,k} w_i, \end{aligned}$$(68)where the double sequence \((t_{i,k})\) is defined by (13).
(ii) Under \((K_0)\), assume that the sequence \((t_i)\) defined by (15) satisfies \(\sum _{i=1}^{+\infty }t_{i+1}(w_i)_+<+\infty \). Then the series \(\sum _{i\ge 1}(a_i)_+\) is convergent, and
$$\begin{aligned}\sum _{i= 1}^{+\infty }(a_i)_+\le t_1(a_1)_+ +\sum _{i=1}^{+\infty } t_{i+1} (w_i)_+.\end{aligned}$$
Proof
\(\mathrm{{(i)}}\) Recall from Lemma 2.4\(\mathrm{{(i)}}\) that \(\alpha _i t_{i+1,k}=t_{i,k}-1\) for every \(i\ge 1\) and \(k\ge i+1\). Multiplying inequality (67) by \(t_{i+1,k}\) gives
$$\begin{aligned} t_{i+1,k}\,a_{i+1}\le \alpha _i t_{i+1,k}\,a_i+t_{i+1,k}\,w_i=(t_{i,k}-1)\,a_i+t_{i+1,k}\,w_i, \end{aligned}$$
or equivalently
$$\begin{aligned} t_{i+1,k}\,a_{i+1}-t_{i,k}\,a_i\le -a_i+t_{i+1,k}\,w_i. \end{aligned}$$
By summing from \(i=1\) to \(k-1\), we deduce that
$$\begin{aligned} t_{k,k}\,a_k-t_{1,k}\,a_1\le -\sum _{i=1}^{k-1}a_i+\sum _{i=1}^{k-1}t_{i+1,k}\,w_i. \end{aligned}$$
Since \(t_{k,k}=1\), inequality (68) follows immediately.
\(\mathrm{{(ii)}}\) Taking the positive part of each side of (67), we find
$$\begin{aligned} (a_{i+1})_+\le \alpha _i\,(a_i)_++(w_i)_+. \end{aligned}$$
By applying \(\mathrm{{(i)}}\) with \((a_i)_+\) (resp. \((w_i)_+\)) in place of \(a_i\) (resp. \(w_i\)), we obtain for every \(k\ge 1\)
$$\begin{aligned} \sum _{i=1}^k (a_i)_+\le t_{1,k}(a_1)_++\sum _{i=1}^{k-1} t_{i+1,k}(w_i)_+\le t_1(a_1)_++\sum _{i=1}^{+\infty } t_{i+1}(w_i)_+, \end{aligned}$$
because \(t_{i+1,k} \le t_{i+1}\), and \(\sum _{i=1}^{+\infty } t_{i+1} (w_i)_+<+\infty \) by assumption. Letting k tend to \(+\infty \) completes the proof. \(\square \)
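The telescoping argument in \(\mathrm{{(i)}}\) can be sanity-checked numerically. The sketch below (ours, not from the paper) runs recurrence (67) with equality and computes the weights \(t_{i,k}\) through the backward identity \(t_{k,k}=1\), \(t_{i,k}=1+\alpha _i t_{i+1,k}\), which follows from Lemma 2.4\(\mathrm{{(i)}}\) quoted above (the explicit definition (13) is not reproduced in this appendix). In the equality case, (68) holds with equality.

```python
import random

random.seed(0)
k = 12
alpha = [0.0] + [random.uniform(0.0, 1.2) for _ in range(k)]   # alpha[1..k] >= 0
w = [0.0] + [random.uniform(-1.0, 1.0) for _ in range(k)]      # w[1..k]

# Run recurrence (67) with equality: a_{i+1} = alpha_i * a_i + w_i
a = [0.0, 1.0]                       # a[1] = 1 (index 0 unused)
for i in range(1, k):
    a.append(alpha[i] * a[i] + w[i])

# Weights via the identity alpha_i * t_{i+1,k} = t_{i,k} - 1, with t_{k,k} = 1
t = [0.0] * (k + 1)
t[k] = 1.0
for i in range(k - 1, 0, -1):
    t[i] = 1.0 + alpha[i] * t[i + 1]

lhs = sum(a[1:k + 1])
rhs = t[1] * a[1] + sum(t[i + 1] * w[i] for i in range(1, k))
assert abs(lhs - rhs) < 1e-10        # (68) holds with equality in the tight case
print("bound (68) verified")
```

Replacing the equalities by strict inequalities in the recurrence makes the left-hand side strictly smaller, in line with (68).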
Given a bounded sequence \((x_k)\) in a Banach space \(({\mathcal {X}},\Vert \cdot \Vert )\), the next lemma gives basic properties of the averaged sequence \((\widehat{x}_k)\) defined by (57).
Lemma B.2
Let \(({\mathcal {X}},\Vert \cdot \Vert )\) be a Banach space and let \((x_k)\) be a bounded sequence in \({\mathcal {X}}\). Given a sequence \((\tau _{i,k})_{i,k\ge 1}\) of nonnegative numbers satisfying (55)–(56), let \((\widehat{x}_k)\) be the averaged sequence defined by \(\widehat{x}_k=\sum _{i=1}^{+\infty }\tau _{i,k}x_i\). Then we have
(i) The sequence \((\widehat{x}_k)\) is well-defined, bounded and \(\sup _{k\ge 1}\Vert \widehat{x}_k\Vert \le \sup _{k\ge 1}\Vert x_k\Vert \).
(ii) If \((x_k)\) converges to \(\overline{x}\in {\mathcal {X}}\), then the sequence \((\widehat{x}_k)\) is also convergent and \(\lim _{k\rightarrow +\infty }\widehat{x}_k=\overline{x}\).
Proof
\(\mathrm{{(i)}}\) Set \(M=\sup _{k\ge 1}\Vert x_k\Vert <+\infty \). In view of (55), observe that for every \(k\ge 1\),
$$\begin{aligned} \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i\Vert \le M. \end{aligned}$$(69)
Since the space \({\mathcal {X}}\) is complete, we classically deduce that the series \(\sum _{i\ge 1}\tau _{i,k}x_i\) is convergent. From the definition of \(\widehat{x}_k\), we then have \(\Vert \widehat{x}_k\Vert \le \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i\Vert ,\) and hence \(\Vert \widehat{x}_k\Vert \le M\) in view of (69).
\(\mathrm{{(ii)}}\) Assume that \((x_k)\) converges to \(\overline{x}\in {\mathcal {X}}\). By using (55), we have for every \(k\ge 1\),
$$\begin{aligned} \Vert \widehat{x}_k-\overline{x}\Vert \le \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i-\overline{x}\Vert . \end{aligned}$$
Fix \(\varepsilon >0\), and let \(K\ge 1\) be such that \(\Vert x_i-\overline{x}\Vert \le \varepsilon \) for every \(i\ge K\). From the above inequality, we obtain
$$\begin{aligned} \Vert \widehat{x}_k-\overline{x}\Vert \le M\sum _{i=1}^{K-1}\tau _{i,k}+\varepsilon \sum _{i=K}^{+\infty }\tau _{i,k}\le M\sum _{i=1}^{K-1}\tau _{i,k}+\varepsilon , \end{aligned}$$
with \(M=\sup _{i\ge 1}\Vert x_i-\overline{x}\Vert <+\infty \). Taking the upper limit as \(k\rightarrow +\infty \), we deduce from (56) that
$$\begin{aligned} \limsup _{k\rightarrow +\infty }\Vert \widehat{x}_k-\overline{x}\Vert \le \varepsilon . \end{aligned}$$
Since this is true for every \(\varepsilon >0\), we conclude that \(\lim _{k\rightarrow +\infty }\Vert \widehat{x}_k-\overline{x}\Vert =0\). \(\square \)
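As a minimal illustration of Lemma B.2 (ours, not from the paper), take the Cesàro weights \(\tau _{i,k}=1/k\) for \(i\le k\) and \(\tau _{i,k}=0\) for \(i>k\), assuming these are admissible for conditions (55)–(56), which are not reproduced here. The averages of a convergent sequence then converge to the same limit:

```python
# Cesàro averaging: tau_{i,k} = 1/k for i <= k, 0 otherwise
# (assumed to satisfy (55)-(56); those conditions are not restated in the appendix)
def avg(x, k):
    # x is 1-indexed: x[0] is an unused placeholder
    return sum(x[i] for i in range(1, k + 1)) / k

# A convergent sequence x_k -> 2, e.g. x_k = 2 + (-1)^k / k
N = 2000
x = [0.0] + [2.0 + (-1) ** k / k for k in range(1, N + 1)]

xhat_N = avg(x, N)
assert abs(xhat_N - 2.0) < 0.01                      # (ii): averages tend to the same limit
assert abs(xhat_N) <= max(abs(v) for v in x[1:])     # (i): sup bound on the averages
print("checks passed")
```

The oscillating term \((-1)^k/k\) shows that the averaged sequence can converge even faster than the original one, since the partial sums of \((-1)^k/k\) stay bounded.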
Attouch, H., Cabot, A. Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Program. 184, 243–287 (2020). https://doi.org/10.1007/s10107-019-01412-0
Keywords
- Maximally monotone operators
- Yosida regularization
- Inertial proximal method
- Large step proximal method
- Lyapunov analysis
- (Over)Relaxation