Abstract
In the Euclidean setting, the proximal gradient method and its accelerated variants are a class of efficient algorithms for optimization problems with a decomposable objective. In this paper, we develop a Riemannian proximal gradient method (RPG) and its accelerated variant (ARPG) for similar problems that are additionally constrained to a manifold. The global convergence of RPG is established under mild assumptions, and an O(1/k) convergence rate is also derived for RPG based on the notion of retraction convexity. If, in addition, the objective function satisfies the Riemannian Kurdyka–Łojasiewicz (KL) property, it is further shown that the sequence generated by RPG converges to a single stationary point. As in the Euclidean setting, a local convergence rate can be established if the objective function satisfies the Riemannian KL property with an exponent. Moreover, we show that the restriction of a semialgebraic function onto the Stiefel manifold satisfies the Riemannian KL property, which covers, for example, the well-known sparse PCA problem. Numerical experiments on random and synthetic data are conducted to test the performance of the proposed RPG and ARPG.
Notes
The commonly-used update expression is \(x_{k+1}=\arg \min _x\langle \nabla f(x_k),x-x_k\rangle _2+\frac{L}{2}\Vert x-x_k\Vert _2^2+g(x)\). We reformulate it equivalently for the convenience of the Riemannian formulation given later.
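For reference, completing the square shows that this reformulation recovers the classical proximal mapping form: \(\langle \nabla f(x_k),x-x_k\rangle _2+\frac{L}{2}\Vert x-x_k\Vert _2^2+g(x) = \frac{L}{2}\big \Vert x-\big (x_k-\frac{1}{L}\nabla f(x_k)\big )\big \Vert _2^2+g(x)-\frac{1}{2L}\Vert \nabla f(x_k)\Vert _2^2\), so that \(x_{k+1}={{\,\mathrm{prox}\,}}_{g/L}\big (x_k-\frac{1}{L}\nabla f(x_k)\big )\); for instance, when \(g = \lambda \Vert \cdot \Vert _1\) this is the soft-thresholding step.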
Such a result can be obtained by noting (i) \({{\,\mathrm{D}\,}}f(x)[\eta ] = {\left\langle P_{{{\,\mathrm{T}\,}}_x {\mathcal {M}}} \nabla f(x),\eta \right\rangle _{{{\,\mathrm{F}\,}}}} = {\left\langle {{\,\mathrm{grad}\,}}f(x),\eta \right\rangle _{x}}\) by [15, (B.2)], and (ii) there exists a constant \(\alpha > 0\) such that \(\Vert \eta \Vert _{{{\,\mathrm{F}\,}}} \le \alpha \Vert \eta \Vert _x\) for all \(x \in {\mathcal {M}}\) and \(\eta \in {{\,\mathrm{T}\,}}_x {\mathcal {M}}\), by smoothness of the Riemannian metric and compactness of \({\mathcal {M}}\).
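As a numerical illustration of (i) for an embedded submanifold, the following minimal NumPy sketch compares a finite-difference directional derivative of a test function on the Stiefel manifold with the inner product against the projected Euclidean gradient. The matrix A, the test function \(f(X) = -\frac{1}{2}{{\,\mathrm{trace}\,}}(X^T A X)\), and the use of the metric inherited from the embedding space are illustrative assumptions, not taken from the paper.

import numpy as np

# Minimal sketch: Riemannian gradient as the tangent-space projection of the
# Euclidean gradient on the Stiefel manifold St(p, n) with the embedded metric.
rng = np.random.default_rng(0)
n, p = 8, 3
A = rng.standard_normal((n, n)); A = A + A.T                 # symmetric test matrix (assumption)
X, _ = np.linalg.qr(rng.standard_normal((n, p)))             # a point on St(p, n)

def proj_tangent(X, Z):
    # orthogonal projection onto T_X St(p, n): Z - X * sym(X^T Z)
    return Z - X @ (0.5 * (X.T @ Z + Z.T @ X))

f = lambda X: -0.5 * np.trace(X.T @ A @ X)                   # test function (assumption)
egrad = -A @ X                                               # Euclidean gradient of f at X
rgrad = proj_tangent(X, egrad)                               # Riemannian gradient grad f(X)

eta = proj_tangent(X, rng.standard_normal((n, p)))           # a tangent direction
t = 1e-6
print((f(X + t * eta) - f(X)) / t)   # finite-difference approximation of D f(X)[eta]
print(np.sum(rgrad * eta))           # <grad f(X), eta>_F; the two values agree up to O(t)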
When the desingularising function has the form \(\varsigma (t) = \frac{C}{\theta } t^{\theta }\) for some \(C > 0\), \(\theta \in (0, 1]\), we say that F satisfies the Riemannian KL property with an exponent \(\theta \), as in the Euclidean case.
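As a simple Euclidean illustration (not taken from the paper), \(F(x) = x^2\) satisfies the KL property at \(x^* = 0\) with exponent \(\theta = 1/2\): choosing \(\varsigma (t) = 2\sqrt{t}\) (i.e., \(C = 1\), \(\theta = 1/2\)) gives \(\varsigma '(|F(x)-F(x^*)|)\, \Vert \nabla F(x)\Vert = |x|^{-1}\cdot 2|x| = 2 \ge 1\) for all \(x \ne 0\), which is the standard form of the KL inequality.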
References
Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Absil, P.-A., Mahony, R., Trumpf, J.: An extrinsic look at the Riemannian Hessian. In: Geometric Science of Information (2013)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B. F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542
Beck, A.: First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2017)
Bento, G.C., da Cruz Neto, J. X., Oliveira, P.R.: Convergence of inexact descent methods for nonconvex optimization on Riemannian manifold (2011). arXiv preprint arXiv:1103.4828
Bento, G.C., Ferreira, O.P., Melo, J.G.: Iteration-complexity of gradient, subgradient and proximal point methods on Riemannian manifolds. J. Optim. Theory Appl. 173(2), 548–562 (2017)
Bochnak, J., Coste, M., Roy, M.-F.: Real Algebraic Geometry. Springer, Berlin (1998)
Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4), 1205–1223 (2007)
Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Boothby, W.M.: An Introduction to Differentiable Manifolds and Riemannian Geometry, 2nd edn. Academic Press, London (1986)
Boumal, N.: An introduction to optimization on smooth manifolds (2020)
Boumal, N., Absil, P.-A., Cartis, C.: Global rates of convergence for nonconvex optimization on manifolds. IMA J. Numer. Anal. 39(1), 1–33 (2018)
Chen, S., Ma, S., So, A.M.-C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)
Chen, W., Hui, J., You, Y.: An augmented Lagrangian method for \(\ell _{1}\)-regularized optimization problems with orthogonality constraints. SIAM J. Sci. Comput. 38(4), B570–B592 (2016)
Daniilidis, A., Deville, R., Durand-Cartagena, E., Rifford, L.: Self-contracted curves in Riemannian manifolds. J. Math. Anal. Appl. 457, 1333–1352 (2018)
Nemirovsky, A.S., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. Wiley, New York (1983)
de Carvalho Bento, G., Bitar, S.D.B., da Cruz Neto, J.X., Oliveira, P.R., de Oliveira Souza, J.C.: Computing Riemannian center of mass on Hadamard manifolds. J. Optim. Theory Appl. 183, 977–992 (2019)
do Carmo, M.P.: Riemannian Geometry. Mathematics: Theory & Applications. Birkhäuser, Boston (1992)
Ferreira, O.P., Oliveira, P.R.: Proximal point algorithm on Riemannian manifolds. Optimization 51(2), 257–270 (2002)
Genicot, M., Huang, W., Trendafilov, N.T.: Weakly correlated sparse components with nearly orthonormal loadings. In: Geometric Science of Information, pp. 484–490 (2015)
Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156, 59–99 (2016)
Grohs, P., Hosseini, S.: \(\epsilon \)-subgradient algorithms for locally Lipschitz functions on Riemannian manifolds. Adv. Comput. Math. (2015). https://doi.org/10.1007/s10444-015-9426-z
Grohs, P., Hosseini, S.: Nonsmooth trust region algorithms for locally Lipschitz functions on Riemannian manifolds. IMA J. Numer. Anal. (2015). https://doi.org/10.1093/imanum/drv043
Hosseini, S.: Convergence of nonsmooth descent methods via Kurdyka–Łojasiewicz inequality on Riemannian manifolds (2017). INS Preprint No. 1523
Hosseini, S., Huang, W., Yousefpour, R.: Line search algorithms for locally Lipschitz functions on Riemannian manifolds. SIAM J. Optim. 28(1), 596–619 (2018)
Hosseini, S., Pouryayevali, M.R.: Generalized gradients and characterization of epi-Lipschitz sets in Riemannian manifolds. Nonlinear Anal. Theory Methods Appl. 74(12), 3884–3895 (2011)
Hosseini, S., Uschmajew, A.: A Riemannian gradient sampling algorithm for nonsmooth optimization on manifolds. SIAM J. Optim. 27(1), 173–189 (2017)
Huang, W.: Optimization algorithms on Riemannian manifolds with applications. PhD thesis, Florida State University, Department of Mathematics (2013)
Huang, W., Gallivan, K.A., Absil, P.-A.: A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM J. Optim. 25(3), 1660–1685 (2015)
Huang, W., Wei, K.: Extending FISTA to Riemannian optimization for sparse PCA (2019). arXiv:1909.05485
Huang, W., Wei, K.: Riemannian proximal gradient methods (extended version) (2019). arXiv:1909.06065
Jolliffe, I.T., Trendafilov, N.T., Uddin, M.: A modified principal component technique based on the Lasso. J. Comput. Graph. Stat. 12(3), 531–547 (2003)
Kurdyka, K., Mostowski, T., Parusiński, A.: Proof of the gradient conjecture of R. Thom. Ann. Math. 152, 763–792 (2000)
Lageman, C.: Convergence of gradient-like dynamical systems and optimization algorithms. PhD thesis, Universität Würzburg (2007)
Lai, R., Osher, S.: A splitting method for orthogonality constrained problems. J. Sci. Comput. 58(2), 431–449 (2014)
Lee, J.M.: Introduction to Riemannian Manifolds. Volume 176 of Graduate Texts in Mathematics, 2nd edn. Springer, Berlin (2018)
Li, H., Lin, Z.: Accelerated proximal gradient methods for nonconvex programming. In: International Conference on Neural Information Processing Systems (2015)
Liu, Y., Shang, F., Cheng, J., Cheng, H., Jiao, L.: Accelerated first-order methods for geodesically convex optimization on Riemannian manifolds. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30, pp. 4868–4877. Curran Associates Inc, Red Hook (2017)
Nesterov, Y.E.: A method for solving the convex programming problem with convergence rate \({O}(1/k^{2})\). Dokl. Akad. Nauk SSSR 269, 543–547 (1983). (in Russian)
Sjöstrand, K., Clemmensen, L., Larsen, R., Einarsson, G., Ersbøll, B.: SpaSM: a MATLAB toolbox for sparse statistical modeling. J. Stat. Softw. 84(10), 1–37 (2018)
Srivastava, A., Klassen, E.P.: Functional and Shape Data Analysis. Springer, New York (2016)
Tang, J., Liu, H.: Unsupervised feature selection for linked social media data. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 904–912 (2012)
Zhang, H., Sra, S.: First-order methods for geodesically convex optimization. In: Conference on Learning Theory (2016)
Zhang, Y., Lau, Y., Kuo, H.-W., Cheung, S., Pasupathy, A., Wright, J.: On the global geometry of sphere-constrained sparse blind deconvolution. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Acknowledgements
The authors would like to thank Zirui Zhou for fruitful discussions on the KL property, and thank Shiqian Ma for kindly sharing their codes with us.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Authors are listed alphabetically. WH was partially supported by the Fundamental Research Funds for the Central Universities (No. 20720190060) and the National Natural Science Foundation of China (No. 12001455). KW was partially supported by the NSFC Grant 11801088 and the Shanghai Sailing Program 18YF1401600.
Proofs of Lemmas 6 and 7
1.1 Proof of Lemma 6
Proof
Since R is smooth and therefore \(C^2\), the mapping \(m: {{\,\mathrm{T}\,}}{\mathcal {M}} \times {\mathbb {R}} \rightarrow {{\,\mathrm{T}\,}}{\mathcal {M}}: (\eta , t) \mapsto \frac{D}{d t} \frac{d}{d t} R\left( t \eta \right) \) is continuous, where \(\frac{D}{d t}\) denotes the covariant derivative along the curve \(t \mapsto R(t \eta )\); see, e.g., [21, Proposition 2.2] for the definition of the covariant derivative. In addition, since the set \({\mathcal {D}} = \{ (\eta _x, t) \mid x \in {\bar{\varOmega }}, \Vert \eta _x\Vert _x = 1, 0 \le t \le \delta _T \}\) is compact, there exists a positive constant \(b_2\) such that
for all \((\eta , t) \in {\mathcal {D}}\).
If \(\eta _x = 0_x\), then the conclusion holds trivially. Otherwise, let \({\tilde{\eta }}_x = \eta _x / \Vert \eta _x\Vert _x\). Since \({{\,\mathrm{dist}\,}}(x, y)\) is the infimum of the lengths of curves connecting x and y, we have
where the right-hand side is the length of the curve \(t \mapsto R_x(t \eta _x)\). Using the Cauchy–Schwarz inequality and the compatibility of the Riemannian metric with the affine connection, we have
It follows that
where \(b_3 = 1 + b_2 \delta _T / 2\). Combining (A.2) and (A.3) yields the result. \(\square \)
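In more detail, the length bound can be sketched as follows: writing \(c(t) = R_x(t {\tilde{\eta }}_x)\), the compatibility of the metric with the connection and the Cauchy–Schwarz inequality give \(\frac{d}{dt} \Vert c'(t)\Vert _{c(t)}^2 = 2 \left\langle \frac{D}{dt} c'(t), c'(t)\right\rangle _{c(t)} \le 2 b_2 \Vert c'(t)\Vert _{c(t)}\), hence \(\Vert c'(t)\Vert _{c(t)} \le \Vert c'(0)\Vert _x + b_2 t = 1 + b_2 t\). Integrating over \(t \in [0, \Vert \eta _x\Vert _x]\) yields \({{\,\mathrm{dist}\,}}(x, R_x(\eta _x)) \le \Vert \eta _x\Vert _x + \frac{b_2}{2} \Vert \eta _x\Vert _x^2 \le \left( 1 + \frac{b_2 \delta _T}{2}\right) \Vert \eta _x\Vert _x = b_3 \Vert \eta _x\Vert _x\) whenever \(\Vert \eta _x\Vert _x \le \delta _T\).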
1.2 Proof of Lemma 7
Proof
For any \(x \in {\bar{\varOmega }}\), there exist a positive constant \(\varrho _x\) and a neighborhood \({\mathcal {U}}_x\) of x such that \({\mathcal {U}}_x\) is a totally restrictive set with respect to \(\varrho _x\). Since \({\bar{\varOmega }}\) is compact, there exist finitely many points \(x_i\) whose totally restrictive sets cover \({\bar{\varOmega }}\), i.e., \(\cup _{i = 1}^t {\mathcal {U}}_{x_i} \supset {\bar{\varOmega }}\). Let \(\delta = \frac{1}{2} \min (\varrho _{x_i}, i = 1, \ldots , t)\). Then for any \(x \in {\bar{\varOmega }}\), the retraction R is a diffeomorphism on \({\mathbb {B}}(x, 2 \delta )\). Therefore, \({\mathcal {T}}_{R_{\eta _x}}^{\sharp }\) is invertible for any \(\eta _x\) satisfying \(\Vert \eta _x\Vert _x < 2 \delta \).
Since \({\mathcal {T}}_{R_{\eta _x}}^{-\sharp }\) is smooth with respect to \(\eta _x\) and the set \(\{\eta _x \mid x \in {{\bar{\varOmega }}}, \Vert \eta _x\Vert \le \delta \}\) is compact, there exists a constant \(L_t > 0\) such that
By Lemma 6, there exists a positive constant \(\kappa \) such that
for all \(x \in {\bar{\varOmega }}\) and for all \(\eta _x \in {\mathcal {B}}(0_x, \delta )\). Let \({\tilde{\delta }} = \min (\delta , i({\bar{\varOmega }}) / \kappa )\). For all \(\eta _x \in {\mathcal {B}}(0_x, {\tilde{\delta }})\) it holds that
By the definition of locally Lipschitz continuity of a vector field, we have \(\Vert {\mathcal {P}}_{\gamma }^{0 \leftarrow 1} \xi _y - \xi _x\Vert _x \le L_v {{\,\mathrm{dist}\,}}(x, y)\) for any \(x, y \in {{\bar{\varOmega }}}\) and \({{\,\mathrm{dist}\,}}(x, y) < i({\bar{\varOmega }})\). Since the parallel translation is isometric, it holds that \(\Vert \xi _y - {\mathcal {P}}_{\gamma }^{1 \leftarrow 0} \xi _x\Vert _y \le L_v {{\,\mathrm{dist}\,}}(x, y)\). Using (A.5) and (A.6) yields
for all \(\eta _x \in {\mathcal {B}}(0_x, {\tilde{\delta }})\), where \(y = R_x(\eta _x)\).
By [32, Lemma 3.5], for any \({\bar{x}} \in {\mathcal {M}}\), there exists a neighborhood \({\mathcal {U}}_{{\bar{x}}}\) of \({\bar{x}}\) and a positive number \(L_{{\bar{x}}}\) such that for all \(x, y \in {\mathcal {U}}_{{\bar{x}}}\) it holds that
Since \({{\bar{\varOmega }}}\) is compact, there exist finitely many points \({\bar{x}}_1, \ldots , {\bar{x}}_t\) such that \(\cup _{i=1}^t {\mathcal {U}}_{{\bar{x}}_i} \supset {{\bar{\varOmega }}}\). Let \(L_{cc} = \max (L_{{\bar{x}}_i}, i = 1, \ldots , t)\) and \(\sigma = \sup \left\{ r > 0 \mid \forall z \in {{\bar{\varOmega }}}, \; \exists i \hbox { such that } {\mathbb {B}}(z, r) \subseteq {\mathcal {U}}_{{\bar{x}}_i} \right\} \). Since there are finitely many \({\bar{x}}_i\), we have \(L_{cc} < \infty \), and \(\sigma > 0\) by the compactness of \({{\bar{\varOmega }}}\) (Lebesgue number lemma). Therefore, for any \(x, y \in {{\bar{\varOmega }}}\) satisfying \({{\,\mathrm{dist}\,}}(x, y) < \sigma \), it holds that
Note that \(\Vert \eta _x\Vert _x < \sigma / \kappa \) implies \({{\,\mathrm{dist}\,}}(x, y) < \sigma \) by (A.5). It follows from (A.4), (A.7) and (A.8) that for any \(x, y \in {{\bar{\varOmega }}}\) satisfying \(\Vert \eta _x\Vert _x < \min (\sigma / \kappa , {\tilde{\delta }})\),
where \(L_{c} = L_v \kappa + L_{cc} \sup _{x \in {{\bar{\varOmega }}}} \Vert \xi _x\Vert _x + a L_t\).\(\square \)
Cite this article
Huang, W., Wei, K. Riemannian proximal gradient methods. Math. Program. 194, 371–413 (2022). https://doi.org/10.1007/s10107-021-01632-3