Abstract
We study an inexact proximal stochastic gradient (IPSG) method for convex composite optimization, in which the objective is the sum of an average of a large number of smooth convex functions and a convex, but possibly nonsmooth, function. Variance reduction techniques are incorporated into the method to reduce the variance of the stochastic gradient. The main feature of the IPSG algorithm is that it allows the proximal subproblems to be solved inexactly while still guaranteeing global convergence with desirable complexity bounds. Several subproblem stopping criteria are proposed. Global convergence and component gradient complexity bounds are derived both when the objective function is strongly convex and when it is merely convex. Preliminary numerical experiments show the overall efficiency of the IPSG algorithm.
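For readers who want the shape of such a method in code, the following is a minimal sketch of one outer loop of a proximal stochastic gradient iteration with SVRG-style variance reduction and an inexactly solved proximal subproblem. It is an illustration under assumptions, not the paper's IPSG algorithm: the helper names (grad_f_i, prox_inexact), the step size eta, the epoch length m, and the fixed tolerance eps are hypothetical, and the paper's actual subproblem stopping criteria are more refined than a single scalar tolerance.

```python
import numpy as np

def ipsg_epoch(x, grad_f_i, prox_inexact, n, m, eta, eps):
    """One illustrative epoch (not the authors' exact scheme).

    x            -- current iterate (numpy array)
    grad_f_i     -- grad_f_i(i, x): gradient of the i-th smooth component
    prox_inexact -- prox_inexact(v, eta, eps): approximate proximal point
                    of the nonsmooth term, solved to tolerance eps
    n            -- number of smooth components, m -- inner-loop length
    """
    x_tilde = x.copy()
    # Full gradient at the snapshot point: the variance-reduction anchor.
    full_grad = sum(grad_f_i(i, x_tilde) for i in range(n)) / n
    for _ in range(m):
        i = np.random.randint(n)
        # SVRG-style variance-reduced stochastic gradient estimate.
        v = grad_f_i(i, x) - grad_f_i(i, x_tilde) + full_grad
        # Inexact proximal step: the subproblem
        #   min_y  h(y) + (1/(2*eta)) * ||y - (x - eta*v)||^2
        # is solved only approximately, to accuracy eps, by a user-supplied
        # inner solver implementing one of the stopping criteria.
        x = prox_inexact(x - eta * v, eta, eps)
    return x
```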
Notes
The datasets are available at http://www.gems-system.org.
Additional information
This research is partially supported by the National Natural Science Foundation of China (Grant 11301505) and the National Science Foundation of the USA (Grant 1522654).
Cite this article
Wang, X., Wang, S. & Zhang, H. Inexact proximal stochastic gradient method for convex composite optimization. Comput Optim Appl 68, 579–618 (2017). https://doi.org/10.1007/s10589-017-9932-7
Keywords
- Convex composite optimization
- Empirical risk minimization
- Stochastic gradient
- Inexact methods
- Global convergence
- Complexity bound