Abstract
We consider a mini-batch stochastic Bregman proximal gradient method and a mini-batch stochastic Bregman proximal extragradient method for stochastic convex composite optimization problems. We propose a simplified and unified convergence analysis framework that yields almost sure convergence properties and expected convergence rates for the mini-batch stochastic Bregman proximal gradient method and its variants. The same framework also covers the mini-batch stochastic Bregman proximal extragradient method, whose convergence has seldom been discussed in the literature. Notably, neither the standard uniformly bounded variance assumption nor the usual Lipschitz continuity assumption on the gradient is required in the analysis.
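For orientation, the generic update behind these methods can be sketched as follows; the notation ($f$, $r$, $h$, $D_h$, $\alpha_k$, $m_k$) is illustrative and not taken verbatim from the paper. For a composite objective $F(x)=\mathbb{E}_{\xi}[f(x,\xi)]+r(x)$, a mini-batch gradient estimate $g_k=\frac{1}{m_k}\sum_{i=1}^{m_k}\nabla_x f(x_k,\xi_{k,i})$, and the Bregman distance $D_h(x,y)=h(x)-h(y)-\langle\nabla h(y),\,x-y\rangle$ induced by a convex kernel $h$, a stochastic Bregman proximal gradient step takes the form

$$x_{k+1} \in \operatorname*{arg\,min}_{x}\Big\{\langle g_k, x\rangle + r(x) + \tfrac{1}{\alpha_k}\, D_h(x, x_k)\Big\}.$$

An extragradient variant of this scheme performs two such steps per iteration: an extrapolation step from $x_k$ to an intermediate point $y_k$, followed by the actual update from $x_k$ using a fresh mini-batch gradient estimate evaluated at $y_k$.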


Acknowledgements
The author would like to thank the referees and the associate editor for their helpful comments and suggestions. This work was partially supported by the National Natural Science Foundation of China (No. 11871135) and the Fundamental Research Funds for the Central Universities (No. DUT19K46).
Additional information
Communicated by Alfredo N. Iusem.
Cite this article
Xiao, X. A Unified Convergence Analysis of Stochastic Bregman Proximal Gradient and Extragradient Methods. J Optim Theory Appl 188, 605–627 (2021). https://doi.org/10.1007/s10957-020-01799-3