A Stochastic Nesterov’s Smoothing Accelerated Method for General Nonsmooth Constrained Stochastic Composite Convex Optimization

Published in Journal of Scientific Computing (2022)

Abstract

We propose a novel stochastic Nesterov's smoothing accelerated method for general nonsmooth, constrained, stochastic composite convex optimization, in which the nonsmooth component may not admit an easily computable proximal operator. The proposed method combines Nesterov's smoothing accelerated method (Nesterov in Math Program 103(1):127–152, 2005) for deterministic problems with stochastic approximation for stochastic problems, and admits three variants: one using a single sample per iteration and two using different mini-batch sizes. We prove that all three variants achieve the best-known complexity bounds in terms of calls to the stochastic oracle. Numerical results on a robust linear regression problem and a support vector machine problem show that the proposed method compares favorably with other state-of-the-art first-order methods, and that the mini-batch variants outperform the single-sample variant.
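To make the smoothing idea concrete, here is a minimal Python sketch under stated assumptions: it is not the paper's exact algorithm, and the function names, step-size rule, and ball constraint are illustrative choices only. It smooths the absolute-value loss of a robust linear regression problem with the Huber function (Nesterov's smoothing of |r|) and runs a standard accelerated stochastic gradient scheme with mini-batch gradient estimates and projection onto the constraint set.

```python
import numpy as np

def huber_grad(r, mu):
    # Derivative of Nesterov's smoothing of |r| (the Huber function):
    # h_mu(r) = r^2 / (2*mu) if |r| <= mu, else |r| - mu/2.
    return np.clip(r / mu, -1.0, 1.0)

def project_ball(x, radius):
    # Euclidean projection onto {x : ||x||_2 <= radius}, a stand-in
    # for the problem's simple closed convex constraint set.
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else (radius / nrm) * x

def stochastic_smoothing_agd(A, b, mu=0.1, radius=10.0, batch=32, iters=500, seed=0):
    # Accelerated stochastic gradient on the mu-smoothed robust regression
    # loss (1/n) * sum_i h_mu(a_i^T x - b_i) over the ball constraint.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)                            # output (averaged) iterate
    z = np.zeros(d)                            # aggregated gradient iterate
    L = np.linalg.norm(A, 2) ** 2 / (mu * n)   # Lipschitz constant of the smoothed gradient
    for k in range(iters):
        alpha = 2.0 / (k + 2)                  # Nesterov extrapolation weight
        y = (1 - alpha) * x + alpha * z        # extrapolated query point
        idx = rng.integers(0, n, size=batch)   # draw a mini-batch (batch=1 gives a single-sample variant)
        Ai, bi = A[idx], b[idx]
        g = Ai.T @ huber_grad(Ai @ y - bi, mu) / batch  # stochastic gradient at y
        z = project_ball(z - g / (alpha * L), radius)   # projected aggregation step
        x = (1 - alpha) * x + alpha * z        # convex-combination update
    return x

# Tiny usage example on synthetic data:
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)
x_hat = stochastic_smoothing_agd(A, b)
```

Note the trade-off that drives Nesterov's smoothing: a smaller mu approximates the nonsmooth loss more tightly but inflates the Lipschitz constant of the smoothed gradient (here ||A||_2^2 / (mu n)) and thus shrinks the usable step size, while the batch size distinguishes the single-sample and mini-batch regimes.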

Notes

  1. http://www.cs.umd.edu/~sen/lbc-proj/data/cora.tgz.

  2. http://qwone.com/~jason/20Newsgroups.

  3. http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html.

References

  1. Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)

  2. Chapelle, O., Sindhwani, V., Keerthi, S.S.: Optimization techniques for semi-supervised support vector machines. J. Mach. Learn. Res. 9, 203–233 (2008)

  3. Shivaswamy, P.K., Jebara, T.: Relative margin machines. NIPS 19, 1481–1488 (2008)

  4. Li, J., Chen, C., So, A.M.C.: Fast epigraphical projection-based incremental algorithms for Wasserstein distributionally robust support vector machine. NIPS 33, 4029–4039 (2020)

  5. Crammer, K., Singer, Y.: On the algorithmic implementation of multiclass kernel-based vector machines. J. Mach. Learn. Res. 2, 265–292 (2001)

  6. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)

  7. Polyak, B.: New stochastic approximation type procedures. Automat. i Telemekh. 7(2), 98–107 (1990) (English translation: Automation and Remote Control)

  8. Polyak, B., Juditsky, A.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)

  9. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

  10. Nemirovsky, A., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. Wiley, New York (1983)

  11. Lan, G., Nemirovski, A., Shapiro, A.: Validation analysis of mirror descent stochastic approximation method. Math. Program. 134(2), 425–458 (2012)

  12. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1–2), 267–305 (2016)

  13. Wang, X., Wang, X., Yuan, Y.-X.: Stochastic proximal quasi-Newton methods for non-convex composite optimization. Optim. Methods Softw. 34(5), 922–948 (2019)

  14. Chen, S., Ma, S., So, A.M.-C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)

  15. Xiao, X.: A unified convergence analysis of stochastic Bregman proximal gradient and extragradient method. J. Optim. Theory Appl. 188(3), 605–627 (2021)

  16. Bai, J., Hager, W.W., Zhang, H.: An inexact accelerated stochastic ADMM for separable convex optimization. Comput. Optim. Appl. 81, 479–518 (2022)

  17. Bai, J., Han, D., Sun, H., Zhang, H.: Convergence on a symmetric accelerated stochastic ADMM with larger stepsizes. CSIAM-AM. (2022). https://doi.org/10.4208/csiam-am.SO-2021-0021

  18. Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4), 2057–2075 (2014)

  19. Nitanda, A.: Stochastic proximal gradient descent with acceleration techniques. NIPS 27, 1574–1582 (2014)

  20. Wang, X., Wang, S., Zhang, H.: Inexact proximal stochastic gradient method for convex composite optimization. Comput. Optim. Appl. 68(3), 579–618 (2017)

  21. Allen-Zhu, Z.: Katyusha: the first direct acceleration of stochastic gradient methods. J. Mach. Learn. Res. 18(1), 8194–8244 (2017)

  22. Reddi, S., Sra, S., Poczos, B., Smola, A.J.: Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. NIPS 29, 1145–1153 (2016)

  23. Pham, N.H., Nguyen, L.M., Phan, D.T., Tran-Dinh, Q.: ProxSARAH: an efficient algorithmic framework for stochastic composite nonconvex optimization. J. Mach. Learn. Res. 21(110), 1–48 (2020)

  24. Shivaswamy, P.K., Jebara, T.: Maximum relative margin and data-dependent regularization. J. Mach. Learn. Res. 11(2), 747–788 (2010)

  25. Zhang, T., Zhou, Z.H.: Optimal margin distribution machine. IEEE Trans. Knowl. Data Eng. 32(6), 1143–1156 (2019)

  26. Crammer, K., Dredze, M., Pereira, F.: Confidence-weighted linear classification for text categorization. J. Mach. Learn. Res. 13(1), 1891–1926 (2012)

  27. Bertsimas, D., Gupta, V., Kallus, N.: Robust sample average approximation. Math. Program. 171(1), 217–282 (2018)

  28. Chen, X.: Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 134(1), 71–99 (2012)

  29. Chen, X.: Smoothing methods for complementarity problems and their applications: a survey. J. Oper. Res. Soc. Jpn. 43(1), 32–47 (2000)

  30. Zhang, C., Chen, X.: A smoothing active set method for linearly constrained non-Lipschitz nonconvex optimization. SIAM J. Optim. 30, 1–30 (2020)

  31. Zhang, C., Chen, X.: Smoothing projected gradient method and its application to stochastic linear complementarity problems. SIAM J. Optim. 20, 627–649 (2009)

  32. Polyak, B.: Introduction to Optimization. Optimization Software Inc., New York (1987)

  33. Ouyang, H., Gray, A.G.: Stochastic smoothing for nonsmooth minimizations: accelerating SGD by exploiting structure. ICML 2, 1523–1530 (2012)

  34. Devolder, O., Glineur, F., Nesterov, Y.: Double smoothing technique for large-scale linearly constrained convex optimization. SIAM J. Optim. 22(2), 702–727 (2012)

  35. Tran-Dinh, Q.: Adaptive smoothing algorithms for nonsmooth composite convex minimization. Comput. Optim. Appl. 66(3), 425–451 (2017)

  36. Duchi, J.C., Bartlett, P.L., Wainwright, M.J.: Randomized smoothing for stochastic optimization. SIAM J. Optim. 22(2), 674–701 (2012)

  37. Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. http://www.math.washington.edu/~tseng/papers/apgm.pdf (2008)

  38. Li, Z., Li, J.: A simple proximal stochastic gradient method for nonsmooth nonconvex optimization. NIPS 31, 5569–5579 (2018)

  39. Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133(1), 365–397 (2012)

  40. Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. NIPS 27, 422 (2014)

  41. Reddi, S.J., Hefny, A., Sra, S., Póczos, B., Smola, A.: Stochastic variance reduction for nonconvex optimization. ICML 2, 314–323 (2016)

  42. Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: a novel method for machine learning problems using stochastic recursive gradient. ICML 5, 2613–2621 (2017)

  43. Zhou, K., Jin, Y., Ding, Q., Cheng, J.: Amortized Nesterov's momentum: a robust momentum and its application to deep learning. UAI 7, 211–220 (2020)

  44. Beck, A.: First-Order Methods in Optimization. SIAM, Philadelphia (2017)

  45. Noble, W.S.: What is a support vector machine? Nat. Biotechnol. 24(12), 1565–1567 (2006)

  46. Huang, S., Cai, N., Pacheco, P.P., Narrandes, S., Wang, Y., Xu, W.: Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genom. Proteom. 15(1), 41–51 (2018)

  47. Rodriguez, R., Vogt, M., Bajorath, J.: Support vector machine classification and regression prioritize different structural features for binary compound activity and potency value prediction. ACS Omega 2(10), 6371–6379 (2017)

  48. Chang, C., Lin, C.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)


Acknowledgements

This work is supported in part by the Natural Science Foundation of Beijing, China (Grant No. 1202021) and the Natural Science Foundation of China (Grant No. 12171027).

Author information

Corresponding author

Correspondence to Chao Zhang.

About this article

Cite this article

Wang, R., Zhang, C., Wang, L. et al. A Stochastic Nesterov’s Smoothing Accelerated Method for General Nonsmooth Constrained Stochastic Composite Convex Optimization. J Sci Comput 93, 52 (2022). https://doi.org/10.1007/s10915-022-02016-1
