Sequential optimality conditions for nonlinear optimization on Riemannian manifolds and a globally convergent augmented Lagrangian method

Abstract

Recently, the approximate Karush–Kuhn–Tucker (AKKT) conditions, also known as sequential optimality conditions, have been proposed for nonlinear optimization in Euclidean spaces, and several methods that find points satisfying them have been developed. These conditions are genuine necessary optimality conditions: every local optimum satisfies them without any constraint qualification (CQ). In this paper, we extend the AKKT conditions to nonlinear optimization on Riemannian manifolds and propose an augmented Lagrangian (AL) method that globally converges to points satisfying these conditions. In addition, we prove that the AKKT and KKT conditions are indeed equivalent under a certain CQ. Finally, we examine the effectiveness of the proposed AL method through several numerical experiments.
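At a high level, an AL method of the kind discussed in this paper is an outer multiplier-and-penalty loop whose subproblems are minimized over the manifold. As a rough, hedged illustration only (not the authors' algorithm or implementation), the following Python sketch shows a classical safeguarded augmented Lagrangian outer loop for a problem of the form min f(x) s.t. g(x) <= 0 with x restricted to a manifold; the names `f`, `g`, `solve_subproblem`, and all parameter values are placeholders, and the Riemannian inner solver is left abstract.

```python
# Hedged sketch of a safeguarded augmented Lagrangian outer loop for
#     min f(x)  s.t.  g(x) <= 0,  x in M (a Riemannian manifold).
# The inner solver, f, g, and all parameters are placeholders, not the paper's method.
import numpy as np

def augmented_lagrangian(f, g, solve_subproblem, x0, m,
                         rho=1.0, gamma=2.0, tau=0.5,
                         lam_max=1e6, max_iter=50, tol=1e-6):
    x, lam = x0, np.zeros(m)             # iterate on M and multiplier estimate
    infeas_prev = np.inf
    for _ in range(max_iter):
        rho_k, lam_k = rho, lam.copy()   # freeze values for the subproblem

        def L(x_):
            # Augmented Lagrangian for the inequality constraints g(x) <= 0.
            return f(x_) + (rho_k / 2.0) * np.sum(
                np.maximum(0.0, lam_k / rho_k + g(x_)) ** 2)

        # Approximately minimize L over the manifold (placeholder inner solver,
        # e.g. a Riemannian gradient- or Newton-type method).
        x = solve_subproblem(L, x)

        # First-order multiplier update, kept in a bounded (safeguarded) box.
        lam = np.clip(lam + rho * g(x), 0.0, lam_max)

        # Joint infeasibility/complementarity measure for g(x) <= 0.
        infeas = np.linalg.norm(np.minimum(-g(x), lam / rho))
        if infeas <= tol:
            break
        if infeas > tau * infeas_prev:   # insufficient progress: increase penalty
            rho *= gamma
        infeas_prev = infeas
    return x, lam
```

In practice the inner step `solve_subproblem` would be a Riemannian optimization method applied to the augmented Lagrangian, and the stopping and safeguarding rules would follow the paper rather than the simple heuristics used in this sketch.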

Data availability

Not applicable.


Acknowledgements

The authors would like to thank the two anonymous referees for their valuable comments and constructive suggestions.

Author information

Corresponding author

Correspondence to Yuya Yamakawa.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

We present the proof of Lemma 2.

Proof of Lemma 2

We prove this lemma by contradiction. Suppose that \(\mathop{{\rm grad}}\theta (w) \not = 0\). Since the optimal solution \(w \in {{\mathcal {M}}}\) satisfies \(w \in \mathrm{int}\,B(x^{*}, \delta )\), there exists \(\varepsilon > 0\) such that \(B(w, \varepsilon ) \subset B(x^{*}, \delta )\). Let \(t > 0\) and \(u := - \mathop{{\rm grad}}\theta (w)\); note that \(u \not = 0\). We consider the Taylor expansion \(\theta (\mathrm{Exp}_{w}(tu)) = \theta (w) + t \langle \mathop{{\rm grad}}\theta (w), u \rangle _{w} + t r(t)\), where \(r : \mathbb {R}_{+} \rightarrow \mathbb {R}\) is a function satisfying \(r(t) \rightarrow 0\) as \(t \rightarrow 0\). Since \(\langle \mathop{{\rm grad}}\theta (w), u \rangle _{w} = - \Vert u \Vert _{w}^{2}\), this expansion implies that

$$\begin{aligned} \textstyle \theta (\mathrm{Exp}_{w}(tu)) = \theta (w) + t \{ r(t) - \Vert u \Vert _{w}^{2} \}. \end{aligned}$$
(A.1)

Since \(\mathrm{Exp}_{w}(tu) \rightarrow w\) and \(r(t) \rightarrow 0\) as \(t \rightarrow 0\), there exists \(t_{0} > 0\) such that \(\mathrm{Exp}_{w}(t_{0}u) \in B(w, \varepsilon )\) and \(r(t_{0}) \le \frac{1}{2} \Vert u \Vert _{w}^{2}\). Note that \(v := \mathrm{Exp}_{w}(t_{0} u)\) is feasible because \(v \in B(w, \varepsilon ) \subset B(x^{*}, \delta )\). Substituting \(t = t_{0}\) into (A.1) yields \(\theta (v) \le \theta (w) - \frac{t_{0}}{2} \Vert u \Vert _{w}^{2} < \theta (w)\). However, this contradicts the assumption that w is an optimal solution. \(\square \)
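To make the expansion used above concrete, here is a small numerical sketch (purely illustrative, not part of the paper) on the unit sphere \(S^{2}\) with \(\theta (x) = \frac{1}{2} x^{\top } A x\): solving (A.1) for the remainder gives \(r(t) = \{ \theta (\mathrm{Exp}_{w}(tu)) - \theta (w) \}/t + \Vert u \Vert _{w}^{2}\), which is observed to vanish as \(t \rightarrow 0\). The matrix \(A\), the sample point, and the closed-form exponential map of the sphere are illustrative assumptions.

```python
# Illustrative check (not from the paper) of the Taylor expansion (A.1)
# on the unit sphere with theta(x) = 0.5 * x^T A x and u = -grad theta(w).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = (A + A.T) / 2                      # symmetric cost matrix (illustrative choice)

def theta(x):
    return 0.5 * x @ A @ x

def riem_grad(x):
    # Riemannian gradient on the sphere: project the Euclidean gradient A x
    # onto the tangent space at x.
    g = A @ x
    return g - (x @ g) * x

def exp_map(w, v):
    # Exponential map on the unit sphere for a tangent vector v at w.
    nv = np.linalg.norm(v)
    if nv < 1e-16:
        return w
    return np.cos(nv) * w + np.sin(nv) * (v / nv)

w = rng.standard_normal(3)
w /= np.linalg.norm(w)                 # a point on the sphere
u = -riem_grad(w)                      # direction used in the proof

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    # Remainder recovered from (A.1): r(t) = (theta(Exp_w(tu)) - theta(w))/t + ||u||^2
    r = (theta(exp_map(w, t * u)) - theta(w)) / t + u @ u
    print(f"t = {t:.0e}, r(t) = {r: .3e}")   # r(t) -> 0 as t -> 0
```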

We next present the proof of Lemma 3.

Proof of Lemma 3

Let \(S_{+} := \{ j \in \{1, \dots , n\}; \, [\xi ]_{j} \ge 0 \}\), \(S_{-} := \{ j \in \{1, \dots , n\}; \, [\xi ]_{j} < 0 \}\), \(T_{+} := \{ j \in \{1, \dots , n\}; \, [\xi ]_{j} \ge [\zeta ]_{j} \}\), and \(T_{-} := \{ j \in \{1, \dots , n\}; \, [\xi ]_{j} < [\zeta ]_{j} \}\). The left-hand side of the desired inequality can be rewritten as

$$\begin{aligned} \Vert [-\xi ]_{+} \Vert = \left[ \sum _{j=1}^{n} \max \{ -[\xi ]_{j}, 0 \}^{2} \right] ^{\frac{1}{2}} = \left[ \sum _{j \in S_{-}} [\xi ]_{j}^{2} \right] ^{\frac{1}{2}}. \end{aligned}$$
(A.2)

Meanwhile, we obtain

$$\begin{aligned} \Vert \min \{ \xi , \zeta \} \Vert = \left[ \sum _{j=1}^{n} \min \{ [\xi ]_{j}, [\zeta ]_{j} \}^{2} \right] ^{\frac{1}{2}} = \left[ \sum _{j \in T_{+}} [\zeta ]_{j}^{2} + \sum _{j \in T_{-}} [\xi ]_{j}^{2} \right] ^{\frac{1}{2}}. \end{aligned}$$
(A.3)

Note that \(S_{-} \subset T_{-}\) is a sufficient condition under which the desired inequality holds. Indeed, if \(S_{-} \subset T_{-}\), then from (A.2) and (A.3), we have

$$\begin{aligned} \Vert [-\xi ]_{+} \Vert = \left[ \sum _{j \in S_{-}} [\xi ]_{j}^{2} \right] ^{\frac{1}{2}} \le \left[ \sum _{j \in T_{-}} [\xi ]_{j}^{2} \right] ^{\frac{1}{2}} \le \left[ \sum _{j \in T_{+}} [\zeta ]_{j}^{2} + \sum _{j \in T_{-}} [\xi ]_{j}^{2} \right] ^{\frac{1}{2}} = \Vert \min \{ \xi , \zeta \} \Vert . \end{aligned}$$

It remains to show that \(S_{-} \subset T_{-}\). Take an arbitrary \(j \in S_{-}\). The definition of \(S_{-}\) and \(\zeta \in \mathbb {R}_{+}^{n}\) imply that \([\xi ]_{j} < 0 \le [\zeta ]_{j}\), and hence the definition of \(T_{-}\) yields \(j \in T_{-}\), i.e., \(S_{-} \subset T_{-}\). This completes the proof. \(\square \)
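As a quick numerical sanity check of the inequality just proved (illustrative only, not part of the paper), the following samples random \(\xi \in \mathbb {R}^{n}\) and \(\zeta \in \mathbb {R}_{+}^{n}\) and verifies \(\Vert [-\xi ]_{+} \Vert \le \Vert \min \{ \xi , \zeta \} \Vert \).

```python
# Random check (not from the paper) of Lemma 3:
# for xi in R^n and zeta in R^n_+, ||[-xi]_+|| <= ||min{xi, zeta}||,
# where [.]_+ denotes the componentwise positive part.
import numpy as np

rng = np.random.default_rng(1)
for _ in range(10000):
    n = rng.integers(1, 8)
    xi = rng.standard_normal(n)
    zeta = np.abs(rng.standard_normal(n))          # zeta in R^n_+
    lhs = np.linalg.norm(np.maximum(-xi, 0.0))     # ||[-xi]_+||
    rhs = np.linalg.norm(np.minimum(xi, zeta))     # ||min{xi, zeta}||
    assert lhs <= rhs + 1e-12
print("Lemma 3 inequality held on all random samples.")
```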


About this article


Cite this article

Yamakawa, Y., Sato, H. Sequential optimality conditions for nonlinear optimization on Riemannian manifolds and a globally convergent augmented Lagrangian method. Comput Optim Appl 81, 397–421 (2022). https://doi.org/10.1007/s10589-021-00336-w
