
Accelerated first-order methods for large-scale convex optimization: nearly optimal complexity under strong convexity


Abstract

We introduce four accelerated (sub)gradient algorithms (ASGA) for solving several classes of convex optimization problems. More specifically, we propose two estimation sequences majorizing the objective function and develop two iterative schemes for each of them. In both cases, the first scheme requires the smoothness parameter and a Hölder constant, while the second scheme is parameter-free (except for the strong convexity parameter, which is set to zero if it is not available) at the price of applying a finitely terminated backtracking line search. The proposed algorithms attain the optimal complexity for smooth problems with Lipschitz continuous gradients, for nonsmooth problems with bounded variation of subgradients, and for weakly smooth problems with Hölder continuous gradients. Further, for strongly convex problems they are optimal in the smooth case and nearly optimal in the nonsmooth and weakly smooth cases. Finally, numerical results for some applications in sparse optimization and machine learning are reported, which confirm the theoretical results.
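For orientation, the sketch below illustrates the general mechanism behind such methods in the smooth case: a Nesterov-type extrapolation step combined with a finitely terminated backtracking search that estimates the unknown smoothness constant. It is a generic accelerated gradient loop, not the ASGA schemes of this paper, and the names f, grad_f, L0, and eta are illustrative assumptions.

```python
# Minimal sketch (not the paper's ASGA): Nesterov-type accelerated gradient
# method with a finitely terminated backtracking line search for an unknown
# Lipschitz constant. Assumes f is convex with Lipschitz-continuous gradient.
import numpy as np

def accelerated_gradient(f, grad_f, x0, L0=1.0, eta=2.0, max_iter=500, tol=1e-8):
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    L = L0                      # current estimate of the smoothness constant
    for _ in range(max_iter):
        g = grad_f(y)
        # Backtracking: increase L until the quadratic upper bound holds at y.
        while True:
            x_new = y - g / L
            if f(x_new) <= f(y) + g @ (x_new - y) + 0.5 * L * np.sum((x_new - y) ** 2):
                break
            L *= eta
        # Nesterov extrapolation (momentum) step.
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x, t = x_new, t_new
    return x

# Example: minimize a strongly convex quadratic f(x) = 0.5*x'Ax - b'x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = accelerated_gradient(lambda x: 0.5 * x @ A @ x - b @ x,
                              lambda x: A @ x - b, np.zeros(2))
```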



Acknowledgements

I would like to thank Arnold Neumaier for his useful comments on this paper. I am also grateful to the anonymous referees and the associate editor for their constructive comments and suggestions, which improved the quality of this paper.

Author information

Corresponding author

Correspondence to Masoud Ahookhosh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The numerical results for the elastic net minimization and the support vector machine are given in Tables 3 and 4, respectively; a schematic sketch of the two objective functions is given after the table captions.

Table 3 Numerical results of NSDSG, NESCO, NESUN, ASGA-1, ASGA-2, ASGA-3, and ASGA-4 for the elastic net minimization problems (54) and (55)
Table 4 Numerical results of NSDSG, NESUN, ASGA-1, ASGA-2, ASGA-3, and ASGA-4 for the binary classification with linear support vector machines (57)
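For orientation, the following is a hedged sketch of the two classes of objectives behind these tables, namely elastic net regression and binary classification with a linear SVM and hinge loss. The precise formulations (54), (55), and (57), the data sets, and the regularization weights are those given in the paper; the parameters lambda1, lambda2, and lam below are placeholder assumptions.

```python
# Hedged sketch of the two problem classes tested in Tables 3 and 4.
import numpy as np

def elastic_net_objective(x, A, b, lambda1=1e-2, lambda2=1e-2):
    """0.5*||Ax - b||^2 + lambda1*||x||_1 + 0.5*lambda2*||x||^2 (nonsmooth, strongly convex)."""
    r = A @ x - b
    return 0.5 * r @ r + lambda1 * np.sum(np.abs(x)) + 0.5 * lambda2 * x @ x

def svm_hinge_objective(w, X, y, lam=1e-2):
    """Regularized hinge loss for labels y in {-1, +1}: mean(max(0, 1 - y*Xw)) + 0.5*lam*||w||^2."""
    margins = 1.0 - y * (X @ w)
    return np.mean(np.maximum(0.0, margins)) + 0.5 * lam * w @ w

def svm_hinge_subgradient(w, X, y, lam=1e-2):
    """One valid subgradient of the hinge-loss objective (nonsmooth where a margin equals 1)."""
    margins = 1.0 - y * (X @ w)
    active = (margins > 0).astype(float)
    return -(X.T @ (active * y)) / X.shape[0] + lam * w
```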


About this article


Cite this article

Ahookhosh, M. Accelerated first-order methods for large-scale convex optimization: nearly optimal complexity under strong convexity. Math Meth Oper Res 89, 319–353 (2019). https://doi.org/10.1007/s00186-019-00674-w

