Abstract
We consider an accelerated proximal gradient algorithm for composite optimization with “independent errors” (errors largely unrelated to the historical information of the iterates) for solving linear inverse problems. We present a new inexact version of the FISTA algorithm that accounts for both deterministic and stochastic noise. We prove convergence rates for the algorithm and connect it with the existing catalyst framework underlying many machine-learning algorithms: a catalyst can be regarded as a special case of FISTA in which the smooth part of the objective vanishes. Our framework thus gives a more generic formulation that yields convergence results for the deterministic and stochastic noise cases as well as for the catalyst framework. Some of our results provide simpler alternative analyses of existing results in the literature, while others extend them to more general situations.
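As a minimal illustration of the setting (a sketch, not the paper's exact scheme or error model), an inexact FISTA iteration with independent gradient noise can be written for the LASSO problem; the function names, the LASSO objective, and the Gaussian noise are our own illustrative choices:

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def inexact_fista(A, b, lam, n_iter=300, noise_std=0.0, seed=0):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1, with an
    "independent" error: fresh Gaussian noise added to each gradient,
    drawn without reference to the iterate history."""
    rng = np.random.default_rng(seed)
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        e = noise_std * rng.standard_normal(x.shape)   # independent error
        x_new = soft_threshold(y - (grad + e) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentic extrapolation
        x, t = x_new, t_new
    return x
```

With `noise_std = 0` this reduces to the standard FISTA of Beck and Teboulle; with `noise_std > 0` each step uses a perturbed gradient, which is the kind of inexactness whose effect on the convergence rate the paper analyzes.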
Acknowledgments
We thank the anonymous referees for their suggestions, which improved the paper. H.J. and L.C. have been supported by the National Natural Science Foundation of China (No. 61402495) and the Natural Science Foundation of Hunan Province, China (2018JJ3616). R.B. has been supported by the Spanish Research Project MTM2015-64095-P. T.S. has been supported by the National Natural Science Foundation of China (No. 61571008).
Cite this article
Sun, T., Barrio, R., Jiang, H. et al. Convergence rates of accelerated proximal gradient algorithms under independent noise. Numer Algor 81, 631–654 (2019). https://doi.org/10.1007/s11075-018-0565-4