Abstract
We consider an accelerated proximal gradient algorithm for composite optimization with “independent errors” (errors largely unrelated to the historical information of the iterates) for solving linear inverse problems. We present a new inexact version of the FISTA algorithm that accounts for both deterministic and stochastic noise. We prove convergence rates for the algorithm and connect it with the existing catalyst framework underlying many machine-learning algorithms: a catalyst can be regarded as a special case of FISTA in which the smooth part of the objective vanishes. Our framework thus gives a more generic formulation that yields convergence results for the deterministic and stochastic noise cases as well as for the catalyst framework. Some of our results provide simpler alternative analyses of existing results in the literature, while others extend them to more general situations.
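As a minimal illustration of the setting (a sketch, not the paper's exact scheme or error model), an inexact FISTA iteration with independent gradient noise can be written for the LASSO problem; the function names, the LASSO objective, and the Gaussian noise are our own illustrative choices:

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def inexact_fista(A, b, lam, n_iter=300, noise_std=0.0, seed=0):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1, with an
    "independent" error: fresh Gaussian noise added to each gradient,
    drawn without reference to the iterate history."""
    rng = np.random.default_rng(seed)
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        e = noise_std * rng.standard_normal(x.shape)   # independent error
        x_new = soft_threshold(y - (grad + e) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentic extrapolation
        x, t = x_new, t_new
    return x
```

With `noise_std = 0` this reduces to the standard FISTA of Beck and Teboulle; with `noise_std > 0` each step uses a perturbed gradient, which is the kind of inexactness whose effect on the convergence rate the paper analyzes.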
Acknowledgments
We thank the anonymous referees for their suggestions, which improved the paper. H.J. and L.C. have been supported by the National Natural Science Foundation of China (No. 61402495) and the Natural Science Foundation of Hunan Province, China (2018JJ3616). R.B. has been supported by the Spanish Research Project MTM2015-64095-P. T.S. has been supported by the National Natural Science Foundation of China (No. 61571008).
Cite this article
Sun, T., Barrio, R., Jiang, H. et al. Convergence rates of accelerated proximal gradient algorithms under independent noise. Numer Algor 81, 631–654 (2019). https://doi.org/10.1007/s11075-018-0565-4