Abstract
We introduce four accelerated (sub)gradient algorithms (ASGA) for solving several classes of convex optimization problems. More specifically, we propose two estimation sequences majorizing the objective function and develop two iterative schemes for each of them. In both cases, the first scheme requires the smoothness parameter and a Hölder constant, while the second scheme is parameter-free (except for the strong convexity parameter, which we set to zero if it is not available) at the price of applying a finitely terminated backtracking line search. The proposed algorithms attain the optimal complexity for smooth problems with Lipschitz continuous gradients, nonsmooth problems with bounded variation of subgradients, and weakly smooth problems with Hölder continuous gradients. Moreover, under strong convexity they are optimal for smooth problems and nearly optimal for nonsmooth and weakly smooth problems. Finally, numerical results for applications in sparse optimization and machine learning are reported, confirming the theoretical analysis.
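To give a concrete flavor of the parameter-free variant described above, the following minimal Python sketch implements a generic Nesterov-type accelerated gradient step in which a local Lipschitz estimate L is enlarged by backtracking until a quadratic upper model majorizes the objective at the trial point. It is only an illustration for smooth unconstrained problems, under assumed callables f and grad_f; it is not the ASGA schemes developed and analyzed in the paper.

```python
import numpy as np

def accelerated_gradient_backtracking(f, grad_f, x0, L0=1.0, eta=2.0,
                                      max_iter=500, tol=1e-8):
    """Generic Nesterov-type accelerated gradient method with backtracking
    on the local Lipschitz estimate L (illustrative sketch only)."""
    x, y, t, L = x0.copy(), x0.copy(), 1.0, L0
    for _ in range(max_iter):
        g = grad_f(y)
        # Backtracking: increase L until the quadratic model upper-bounds f
        # at the gradient step taken from y.
        while True:
            x_new = y - g / L
            d = x_new - y
            if f(x_new) <= f(y) + g @ d + 0.5 * L * (d @ d):
                break
            L *= eta
        # Standard Nesterov/FISTA-type extrapolation of the iterates.
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x, t = x_new, t_new
    return x

# Usage example: least squares with random data (hypothetical test problem).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((40, 20)), rng.standard_normal(40)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
x_star = accelerated_gradient_backtracking(f, grad_f, np.zeros(20))
```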



Acknowledgements
I would like to thank Arnold Neumaier for his useful comments on this paper. I am grateful to the anonymous referees and the associate editor for their constructive comments and suggestions, which improved the quality of this paper.
Appendix
The numerical results for the elastic net minimization and the support vector machine are given in Tables 3 and 4, respectively.
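For orientation (the precise test instances and data sets are specified in the main text), the elastic net problem is typically posed as the strongly convex composite model
\[
\min_{x\in\mathbb{R}^n}\ \tfrac{1}{2}\|Ax-b\|_2^2+\lambda_1\|x\|_1+\tfrac{\lambda_2}{2}\|x\|_2^2,\qquad \lambda_1,\lambda_2>0,
\]
whose quadratic \(\ell_2\) term makes the objective strongly convex with modulus \(\lambda_2\), which is the setting in which the strongly convex complexity bounds apply.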
Cite this article
Ahookhosh, M. Accelerated first-order methods for large-scale convex optimization: nearly optimal complexity under strong convexity. Math Meth Oper Res 89, 319–353 (2019). https://doi.org/10.1007/s00186-019-00674-w
Keywords
- Structured nonsmooth convex optimization
- First-order black-box oracle
- Estimation sequence
- Strong convexity
- Optimal complexity