Abstract
In a Hilbert space \({\mathcal {H}}\), given a maximally monotone operator \(A{:}\;{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\), we study the convergence properties of a general class of relaxed inertial proximal algorithms. This study aims to extend to the general monotone inclusion \(Ax \ni 0\) the acceleration techniques initially introduced by Nesterov for convex minimization. The relaxed form of the proximal algorithms plays a central role. It arises naturally from the regularization of the operator A by its Yosida approximation with a variable parameter, a technique recently introduced by Attouch–Peypouquet (Math Program Ser B, 2018. https://doi.org/10.1007/s10107-018-1252-x) for a particular class of inertial proximal algorithms. Our study provides an algorithmic version of the convergence results obtained by Attouch–Cabot (J Differ Equ 264:7138–7182, 2018) for continuous dynamical systems.

Notes
Note that in [4, Proposition 14], a closely related but different condition has been considered: the difference of the quotients is assumed to be less than or equal to c (and this guarantees (\(K_0\))).
References
Álvarez, F.: Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM J. Optim. 14, 773–782 (2004)
Álvarez, F., Attouch, H.: The heavy ball with friction dynamical system for convex constrained minimization problems, Optimization (Namur, 1998), pp. 25–35, Lecture Notes in Economics and Mathematical Systems, vol. 481. Springer, Berlin, (2000)
Álvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set Valued Anal. 9(1–2), 3–11 (2001)
Attouch, H., Cabot, A.: Convergence rates of inertial forward–backward algorithms. SIAM J. Optim. 28, 849–874 (2018)
Attouch, H., Cabot, A.: Convergence of damped inertial dynamics governed by regularized maximally monotone operators. J. Differ. Equ. 264, 7138–7182 (2018)
Attouch, H., Maingé, P.E.: Asymptotic behavior of second-order dissipative evolution equations combining potential with non-potential effects. ESAIM Control Optim. Calc. Var. 17, 836–857 (2010)
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximal monotone operators. Math. Program. Ser. B. https://doi.org/10.1007/s10107-018-1252-x (2018)
Bauschke, H., Combettes, P.: Convex Analysis and Monotone Operator Theory in Hilbert spaces. CMS Books in Mathematics. Springer, Berlin (2011)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
Bot, R.I., Csetnek, E.R.: Second order forward–backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54, 1423–1443 (2016)
Boţ, R.I., Csetnek, E.R., Hendrich, C.: Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl. Math. Comput. 256, 472–487 (2015)
Brézis, H.: Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution. Lecture Notes 5. North Holland (1972)
Brézis, H., Browder, F.E.: Nonlinear ergodic theorems. Bull. Am. Math. Soc. 82(6), 959–961 (1976)
Brézis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29, 329–345 (1978)
Cabot, A., Frankel, P.: Asymptotics for some proximal-like method involving inertia and memory aspects. Set Valued Var. Anal. 19, 59–74 (2011)
Combettes, P.L., Glaudin, L.E.: Quasinonexpansive iterations on the affine hull of orbits: from Mann’s mean value algorithm to inertial methods. SIAM J. Optim. 27, 2356–2380 (2017)
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
Eckstein, J., Ferris, M.C.: Operator-splitting methods for monotone affine variational inequalities, with a parallel application to optimal control. Informs J. Comput. 10, 218–235 (1998)
Iutzeler, F., Hendrickx, J.M.: Generic online acceleration scheme for optimization algorithms via relaxation and inertia. Optim. Methods Softw. (2018) (to appear)
Lorenz, D.A., Pock, T.: An inertial forward–backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)
Maingé, P.-E.: Convergence theorems for inertial KM-type algorithms. J. Comput. Appl. Math. 219, 223–236 (2008)
Moudafi, A., Oliny, M.: Convergence of a splitting inertial proximal method for monotone operators. J. Comput. Appl. Math. 155, 447–454 (2003)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)
Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)
Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72, 383–390 (1979)
Pesquet, J.-C., Pustelnik, N.: A parallel inertial proximal optimization method. Pac. J. Optim. 8, 273–305 (2012)
Appendices
Appendix A. Yosida regularization
A set-valued mapping A from \({\mathcal {H}}\) to \({\mathcal {H}}\) assigns to each \(x\in {\mathcal {H}}\) a set \(A(x)\subset {\mathcal {H}}\), hence it is a mapping from \({\mathcal {H}}\) to \(2^{\mathcal {H}}\). Every set-valued mapping \(A:{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\) can be identified with its graph defined by
$$\begin{aligned} {\mathrm{gph}}A=\left\{ (x,u)\in {\mathcal {H}}\times {\mathcal {H}}:\ u\in A(x)\right\} . \end{aligned}$$
The set \(\{x\in {\mathcal {H}}:\ 0\in A(x)\}\) of the zeros of A is denoted by \({\mathrm{zer}}A\). An operator \(A:{\mathcal {H}}\rightarrow 2^{\mathcal {H}}\) is said to be monotone if for any (x, u), \((y,v)\in {\mathrm{gph}}A\), one has \(\langle y-x, v-u\rangle \ge 0\). It is maximally monotone if there exists no monotone operator whose graph strictly contains \({\mathrm{gph}}A\). If a single-valued operator \(A:{\mathcal {H}}\rightarrow {\mathcal {H}}\) is continuous and monotone, then it is maximally monotone, cf. [13, Proposition 2.4].
Given a maximally monotone operator A and \(\lambda >0\), the resolvent of A with index \(\lambda \) and the Yosida regularization of A with parameter \(\lambda \) are defined by
$$\begin{aligned} J_{\lambda A}=\left( I+\lambda A\right) ^{-1} \quad \hbox {and}\quad A_{\lambda }=\frac{1}{\lambda }\left( I-J_{\lambda A}\right) , \end{aligned}$$
respectively. The operator \(J_{\lambda A}: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is nonexpansive and everywhere defined (indeed it is firmly nonexpansive). Moreover, \(A_{\lambda }\) is \(\lambda \)-cocoercive: for all \(x, y \in {\mathcal {H}}\) we have
$$\begin{aligned} \langle A_{\lambda }x-A_{\lambda }y,\, x-y\rangle \ge \lambda \Vert A_{\lambda }x-A_{\lambda }y\Vert ^2. \end{aligned}$$
This property immediately implies that \(A_{\lambda }: {\mathcal {H}}\rightarrow {\mathcal {H}}\) is \(\frac{1}{\lambda }\)-Lipschitz continuous. Another property that proves useful is the resolvent equation (see, for example, [13, Proposition 2.6] or [8, Proposition 23.6])
$$\begin{aligned} J_{\lambda A}x= J_{\mu A}\left( \frac{\mu }{\lambda }x+\left( 1-\frac{\mu }{\lambda }\right) J_{\lambda A}x\right) , \end{aligned}$$
which is valid for any \(\lambda , \mu >0\). This property makes it possible to compute the resolvent of \(A_\lambda \) simply by
$$\begin{aligned} J_{\mu A_{\lambda }}x=\frac{\lambda }{\lambda +\mu }\,x+\frac{\mu }{\lambda +\mu }\,J_{(\lambda +\mu ) A}x \end{aligned}$$
for any \(\lambda , \mu >0\). Also note that for any \(x \in {\mathcal {H}}\) and any \(\lambda >0\), \( A_\lambda (x) \in A (J_{\lambda A}x)= A( x - \lambda A_\lambda (x))\). Finally, for any \(\lambda >0\), \(A_{\lambda }\) and A have the same solution set, \({\mathrm{zer}}A_{\lambda }= {\mathrm{zer}}A\). For a detailed presentation of maximally monotone operators and the Yosida approximation, the reader can consult [8] or [13].
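As a concrete illustration (ours, not part of the original text), take \(A=\partial |\cdot |\) on the real line: the resolvent \(J_{\lambda A}\) is the soft-thresholding map and \(A_\lambda (x)=\mathrm{clip}(x/\lambda ,-1,1)\). The sketch below numerically checks the \(\frac{1}{\lambda }\)-Lipschitz bound and the above formula for \(J_{\mu A_\lambda }\); the helper names `soft` and `yosida` are ours.

```python
import math

def soft(x, lam):
    # Resolvent J_{lam A} = (I + lam*A)^{-1} of A = d|.|: soft-thresholding
    return math.copysign(max(abs(x) - lam, 0.0), x)

def yosida(x, lam):
    # Yosida approximation A_lam = (I - J_{lam A}) / lam
    return (x - soft(x, lam)) / lam

lam, mu = 1.0, 2.0
xs = [-5.0 + 0.01 * i for i in range(1001)]

# lam-cocoercivity implies that A_lam is (1/lam)-Lipschitz continuous
for u, v in zip(xs, xs[1:]):
    assert abs(yosida(u, lam) - yosida(v, lam)) <= (1.0 / lam) * abs(u - v) + 1e-12

# Resolvent of the Yosida approximation:
#   J_{mu A_lam} x = lam/(lam+mu) * x + mu/(lam+mu) * J_{(lam+mu)A} x.
# Check that y := right-hand side solves y + mu * A_lam(y) = x.
for x in xs:
    y = lam / (lam + mu) * x + mu / (lam + mu) * soft(x, lam + mu)
    assert abs(y + mu * yosida(y, lam) - x) < 1e-12

print("all checks passed")
```

For instance, with \(\lambda =1\), \(\mu =2\), \(x=5\): \(J_{3A}(5)=2\), so \(y=\frac{1}{3}\cdot 5+\frac{2}{3}\cdot 2=3\) and indeed \(y+2A_1(y)=3+2=5\).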
Appendix B. Some auxiliary results
In this section, we present some auxiliary lemmas that are used throughout the paper.
Lemma B.1
Let \((a_k)\), \((\alpha _k)\) and \((w_k)\) be sequences of real numbers satisfying
$$\begin{aligned} a_{i+1}\le \alpha _i\, a_i+w_i \quad \hbox {for every } i\ge 1. \end{aligned}$$(67)
Assume that \(\alpha _i\ge 0\) for every \(i\ge 1\).
(i) For every \(k\ge 1\), we have
$$\begin{aligned} \sum _{i=1}^k a_i\le t_{1,k}a_1+\sum _{i=1}^{k-1} t_{i+1,k} w_i, \end{aligned}$$(68)where the double sequence \((t_{i,k})\) is defined by (13).
(ii) Under \((K_0)\), assume that the sequence \((t_i)\) defined by (15) satisfies \(\sum _{i=1}^{+\infty }t_{i+1}(w_i)_+<+\infty \). Then the series \(\sum _{i\ge 1}(a_i)_+\) is convergent, and
$$\begin{aligned}\sum _{i= 1}^{+\infty }(a_i)_+\le t_1(a_1)_+ +\sum _{i=1}^{+\infty } t_{i+1} (w_i)_+.\end{aligned}$$
Proof
\(\mathrm{{(i)}}\) Recall from Lemma 2.4\(\mathrm{{(i)}}\) that \(\alpha _i t_{i+1,k}=t_{i,k}-1\) for every \(i\ge 1\) and \(k\ge i+1\). Multiplying inequality (67) by \(t_{i+1,k}\) gives
$$\begin{aligned} t_{i+1,k}\,a_{i+1}\le \alpha _i t_{i+1,k}\,a_i+t_{i+1,k}\,w_i=(t_{i,k}-1)\,a_i+t_{i+1,k}\,w_i, \end{aligned}$$
or equivalently
$$\begin{aligned} t_{i+1,k}\,a_{i+1}-t_{i,k}\,a_i\le -a_i+t_{i+1,k}\,w_i. \end{aligned}$$
By summing from \(i=1\) to \(k-1\), we deduce that
$$\begin{aligned} t_{k,k}\,a_k-t_{1,k}\,a_1\le -\sum _{i=1}^{k-1}a_i+\sum _{i=1}^{k-1}t_{i+1,k}\,w_i. \end{aligned}$$
Since \(t_{k,k}=1\), inequality (68) follows immediately.
\(\mathrm{{(ii)}}\) Taking the positive part of each side of (67), we find
$$\begin{aligned} (a_{i+1})_+\le \alpha _i\,(a_i)_++(w_i)_+. \end{aligned}$$
By applying \(\mathrm{{(i)}}\) with \((a_i)_+\) (resp. \((w_i)_+\)) in place of \(a_i\) (resp. \(w_i\)), we obtain for every \(k\ge 1\)
$$\begin{aligned} \sum _{i=1}^k (a_i)_+\le t_{1,k}(a_1)_++\sum _{i=1}^{k-1} t_{i+1,k}(w_i)_+\le t_1(a_1)_++\sum _{i=1}^{+\infty } t_{i+1}(w_i)_+, \end{aligned}$$
because \(t_{i+1,k} \le t_{i+1}\), and \(\sum _{i=1}^{+\infty } t_{i+1} (w_i)_+<+\infty \) by assumption. Letting k tend to \(+\infty \) completes the proof. \(\square \)
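The telescoping argument in \(\mathrm{{(i)}}\) can be sanity-checked numerically. The sketch below (ours, not from the paper) runs recurrence (67) with equality and computes the weights \(t_{i,k}\) through the backward identity \(t_{k,k}=1\), \(t_{i,k}=1+\alpha _i t_{i+1,k}\), which follows from Lemma 2.4\(\mathrm{{(i)}}\) quoted above (the explicit definition (13) is not reproduced in this appendix). In the equality case, (68) holds with equality.

```python
import random

random.seed(0)
k = 12
alpha = [0.0] + [random.uniform(0.0, 1.2) for _ in range(k)]   # alpha[1..k] >= 0
w = [0.0] + [random.uniform(-1.0, 1.0) for _ in range(k)]      # w[1..k]

# Run recurrence (67) with equality: a_{i+1} = alpha_i * a_i + w_i
a = [0.0, 1.0]                       # a[1] = 1 (index 0 unused)
for i in range(1, k):
    a.append(alpha[i] * a[i] + w[i])

# Weights via the identity alpha_i * t_{i+1,k} = t_{i,k} - 1, with t_{k,k} = 1
t = [0.0] * (k + 1)
t[k] = 1.0
for i in range(k - 1, 0, -1):
    t[i] = 1.0 + alpha[i] * t[i + 1]

lhs = sum(a[1:k + 1])
rhs = t[1] * a[1] + sum(t[i + 1] * w[i] for i in range(1, k))
assert abs(lhs - rhs) < 1e-10        # (68) holds with equality in the tight case
print("bound (68) verified")
```

Replacing the equalities by strict inequalities in the recurrence makes the left-hand side strictly smaller, in line with (68).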
Given a bounded sequence \((x_k)\) in a Banach space \(({\mathcal {X}},\Vert \cdot \Vert )\), the next lemma gives basic properties of the averaged sequence \((\widehat{x}_k)\) defined by (57).
Lemma B.2
Let \(({\mathcal {X}},\Vert \cdot \Vert )\) be a Banach space and let \((x_k)\) be a bounded sequence in \({\mathcal {X}}\). Given a sequence \((\tau _{i,k})_{i,k\ge 1}\) of nonnegative numbers satisfying (55)–(56), let \((\widehat{x}_k)\) be the averaged sequence defined by \(\widehat{x}_k=\sum _{i=1}^{+\infty }\tau _{i,k}x_i\). Then we have
(i) The sequence \((\widehat{x}_k)\) is well-defined, bounded and \(\sup _{k\ge 1}\Vert \widehat{x}_k\Vert \le \sup _{k\ge 1}\Vert x_k\Vert \).
(ii) If \((x_k)\) converges to \(\overline{x}\in {\mathcal {X}}\), then the sequence \((\widehat{x}_k)\) is also convergent and \(\lim _{k\rightarrow +\infty }\widehat{x}_k=\overline{x}\).
Proof
\(\mathrm{{(i)}}\) Set \(M=\sup _{k\ge 1}\Vert x_k\Vert <+\infty \). In view of (55), observe that for every \(k\ge 1\),
$$\begin{aligned} \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i\Vert \le M. \end{aligned}$$(69)
Since the space \({\mathcal {X}}\) is complete, we classically deduce that the series \(\sum _{i\ge 1}\tau _{i,k}x_i\) is convergent. From the definition of \(\widehat{x}_k\), we then have \(\Vert \widehat{x}_k\Vert \le \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i\Vert ,\) and hence \(\Vert \widehat{x}_k\Vert \le M\) in view of (69).
\(\mathrm{{(ii)}}\) Assume that \((x_k)\) converges to \(\overline{x}\in {\mathcal {X}}\). By using (55), we have for every \(k\ge 1\),
$$\begin{aligned} \Vert \widehat{x}_k-\overline{x}\Vert \le \sum _{i=1}^{+\infty }\tau _{i,k}\Vert x_i-\overline{x}\Vert . \end{aligned}$$
Fix \(\varepsilon >0\), and let \(K\ge 1\) be such that \(\Vert x_i-\overline{x}\Vert \le \varepsilon \) for every \(i\ge K\). From the above inequality, we obtain
$$\begin{aligned} \Vert \widehat{x}_k-\overline{x}\Vert \le M\sum _{i=1}^{K-1}\tau _{i,k}+\varepsilon \sum _{i=K}^{+\infty }\tau _{i,k}\le M\sum _{i=1}^{K-1}\tau _{i,k}+\varepsilon , \end{aligned}$$
with \(M=\sup _{i\ge 1}\Vert x_i-\overline{x}\Vert <+\infty \). Taking the upper limit as \(k\rightarrow +\infty \), we deduce from (56) that
$$\begin{aligned} \limsup _{k\rightarrow +\infty }\Vert \widehat{x}_k-\overline{x}\Vert \le \varepsilon . \end{aligned}$$
Since this is true for every \(\varepsilon >0\), we conclude that \(\lim _{k\rightarrow +\infty }\Vert \widehat{x}_k-\overline{x}\Vert =0\). \(\square \)
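As a minimal illustration of Lemma B.2 (ours, not from the paper), take the Cesàro weights \(\tau _{i,k}=1/k\) for \(i\le k\) and \(\tau _{i,k}=0\) for \(i>k\), assuming these are admissible for conditions (55)–(56), which are not reproduced here. The averages of a convergent sequence then converge to the same limit:

```python
# Cesàro averaging: tau_{i,k} = 1/k for i <= k, 0 otherwise
# (assumed to satisfy (55)-(56); those conditions are not restated in the appendix)
def avg(x, k):
    # x is 1-indexed: x[0] is an unused placeholder
    return sum(x[i] for i in range(1, k + 1)) / k

# A convergent sequence x_k -> 2, e.g. x_k = 2 + (-1)^k / k
N = 2000
x = [0.0] + [2.0 + (-1) ** k / k for k in range(1, N + 1)]

xhat_N = avg(x, N)
assert abs(xhat_N - 2.0) < 0.01                      # (ii): averages tend to the same limit
assert abs(xhat_N) <= max(abs(v) for v in x[1:])     # (i): sup bound on the averages
print("checks passed")
```

The oscillating term \((-1)^k/k\) shows that the averaged sequence can converge even faster than the original one, since the partial sums of \((-1)^k/k\) stay bounded.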
Attouch, H., Cabot, A. Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Program. 184, 243–287 (2020). https://doi.org/10.1007/s10107-019-01412-0
Keywords
- Maximally monotone operators
- Yosida regularization
- Inertial proximal method
- Large step proximal method
- Lyapunov analysis
- (Over)Relaxation