Incrementally Updated Gradient Methods for Constrained and Regularized Optimization

Journal of Optimization Theory and Applications

Abstract

We consider an incrementally updated gradient method for minimizing the sum of smooth functions and a convex function. The method can use a (sufficiently small) constant stepsize or, more practically, an adaptive stepsize that is decreased whenever sufficient progress is not made. We show that if either (i) the gradients of the smooth functions are Lipschitz continuous on ℝ^n, or (ii) the gradients of the smooth functions are bounded and Lipschitz continuous over a certain level set and the convex function is Lipschitz continuous on its domain, then every cluster point of the iterates generated by the method is a stationary point. If, in addition, a local Lipschitz error bound assumption holds, then the method is linearly convergent.
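
To fix ideas, the sketch below (in Python/NumPy) shows one concrete form such an iteration can take when the smooth terms are least-squares components and the convex term is an ℓ1 regularizer, so that its proximal map is componentwise soft-thresholding. The cyclic ordering, the function and parameter names, and the simple rule of halving the stepsize after an unproductive cycle are illustrative assumptions, not the adaptive stepsize rule analyzed in the paper.

```python
# Minimal sketch of an incremental proximal gradient iteration for
# minimizing F(x) = sum_i f_i(x) + h(x), with h(x) = lam * ||x||_1 chosen
# purely for illustration. The cyclic order and the stepsize-halving rule
# are assumptions, not the authors' exact method.
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def incremental_prox_gradient(grads, objective, x0, lam,
                              stepsize=1e-2, max_cycles=200, shrink=0.5):
    """One proximal gradient step per smooth component, cycling through the
    components; the stepsize is halved whenever a full cycle fails to
    decrease the objective (a stand-in for the paper's adaptive rule)."""
    x = x0.copy()
    best = objective(x)
    for _ in range(max_cycles):
        for grad_i in grads:                      # incremental pass over components
            x = soft_threshold(x - stepsize * grad_i(x), stepsize * lam)
        val = objective(x)
        if val > best - 1e-12:                    # insufficient progress ...
            stepsize *= shrink                    # ... decrease the stepsize
        best = min(best, val)
    return x

# Toy usage: least squares split row-by-row into smooth components, plus an l1 term.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
grads = [lambda x, a=A[i], bi=b[i]: (a @ x - bi) * a for i in range(len(b))]
obj = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + 0.1 * np.sum(np.abs(x))
x_hat = incremental_prox_gradient(grads, obj, np.zeros(5), lam=0.1)
```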



Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2012R1A1A1006406).

We thank the anonymous referees for their detailed comments, which helped improve the paper.

Author information

Corresponding author

Correspondence to Sangwoon Yun.

Additional information

Communicated by Masao Fukushima.

About this article

Cite this article

Tseng, P., Yun, S. Incrementally Updated Gradient Methods for Constrained and Regularized Optimization. J Optim Theory Appl 160, 832–853 (2014). https://doi.org/10.1007/s10957-013-0409-2

