A Stochastic Nesterov’s Smoothing Accelerated Method for General Nonsmooth Constrained Stochastic Composite Convex Optimization

Published in Journal of Scientific Computing (2022)

Abstract

We propose a novel stochastic Nesterov's smoothing accelerated method for general nonsmooth, constrained, stochastic composite convex optimization, in which the nonsmooth component may not admit an easily computable proximal operator. The proposed method combines Nesterov's smoothing accelerated method (Nesterov in Math Program 103(1):127–152, 2005) for deterministic problems with stochastic approximation for stochastic problems, and admits three variants: one using a single sample per iteration and two using different mini-batch sizes. We prove that all three variants achieve the best-known complexity bounds in terms of calls to the stochastic oracle. Numerical results on a robust linear regression problem and a support vector machine problem show that the proposed method compares favorably with other state-of-the-art first-order methods, and that the mini-batch variants outperform the single-sample variant.
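To make the smoothing idea concrete, here is a minimal Python sketch under stated assumptions: it is not the paper's exact algorithm, and the function names, step-size rule, and ball constraint are illustrative choices only. It smooths the absolute-value loss of a robust linear regression problem with the Huber function (Nesterov's smoothing of |r|) and runs a standard accelerated stochastic gradient scheme with mini-batch gradient estimates and projection onto the constraint set.

```python
import numpy as np

def huber_grad(r, mu):
    # Derivative of Nesterov's smoothing of |r| (the Huber function):
    # h_mu(r) = r^2 / (2*mu) if |r| <= mu, else |r| - mu/2.
    return np.clip(r / mu, -1.0, 1.0)

def project_ball(x, radius):
    # Euclidean projection onto {x : ||x||_2 <= radius}, a stand-in
    # for the problem's simple closed convex constraint set.
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else (radius / nrm) * x

def stochastic_smoothing_agd(A, b, mu=0.1, radius=10.0, batch=32, iters=500, seed=0):
    # Accelerated stochastic gradient on the mu-smoothed robust regression
    # loss (1/n) * sum_i h_mu(a_i^T x - b_i) over the ball constraint.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)                            # output (averaged) iterate
    z = np.zeros(d)                            # aggregated gradient iterate
    L = np.linalg.norm(A, 2) ** 2 / (mu * n)   # Lipschitz constant of the smoothed gradient
    for k in range(iters):
        alpha = 2.0 / (k + 2)                  # Nesterov extrapolation weight
        y = (1 - alpha) * x + alpha * z        # extrapolated query point
        idx = rng.integers(0, n, size=batch)   # draw a mini-batch (batch=1 gives a single-sample variant)
        Ai, bi = A[idx], b[idx]
        g = Ai.T @ huber_grad(Ai @ y - bi, mu) / batch  # stochastic gradient at y
        z = project_ball(z - g / (alpha * L), radius)   # projected aggregation step
        x = (1 - alpha) * x + alpha * z        # convex-combination update
    return x

# Tiny usage example on synthetic data:
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)
x_hat = stochastic_smoothing_agd(A, b)
```

Note the trade-off that drives Nesterov's smoothing: a smaller mu approximates the nonsmooth loss more tightly but inflates the Lipschitz constant of the smoothed gradient (here ||A||_2^2 / (mu n)) and thus shrinks the usable step size, while the batch size distinguishes the single-sample and mini-batch regimes.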

Notes

  1. http://www.cs.umd.edu/~sen/lbc-proj/data/cora.tgz.

  2. http://qwone.com/~jason/20Newsgroups.

  3. http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html.

References

  1. Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)

  2. Chapelle, O., Sindhwani, V., Keerthi, S.S.: Optimization techniques for semi-supervised support vector machines. J. Mach. Learn. Res. 9, 203–233 (2008)

  3. Shivaswamy, P.K., Jebara, T.: Relative margin machines. NIPS 19, 1481–1488 (2008)

  4. Li, J., Chen, C., So, A.M.C.: Fast epigraphical projection-based incremental algorithms for Wasserstein distributionally robust support vector machine. NIPS 33, 4029–4039 (2020)

  5. Crammer, K., Singer, Y.: On the algorithmic implementation of multiclass kernel-based vector machines. J. Mach. Learn. Res. 2, 265–292 (2001)

  6. Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)

  7. Polyak, B.: New stochastic approximation type procedures. Automat. i Telemekh. 7(2), 98–107 (1990) (English translation: Automation and Remote Control)

  8. Polyak, B., Juditsky, A.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)

  9. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

  10. Nemirovsky, A., Yudin, D.B.: Problem Complexity and Method Efficiency in Optimization. Wiley, New York (1983)

  11. Lan, G., Nemirovski, A., Shapiro, A.: Validation analysis of mirror descent stochastic approximation method. Math. Program. 134(2), 425–458 (2012)

  12. Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1–2), 267–305 (2016)

  13. Wang, X., Wang, X., Yuan, Y.-X.: Stochastic proximal quasi-Newton methods for non-convex composite optimization. Optim. Methods Softw. 34(5), 922–948 (2019)

  14. Chen, S., Ma, S., So, A.M.-C., Zhang, T.: Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1), 210–239 (2020)

  15. Xiao, X.: A unified convergence analysis of stochastic Bregman proximal gradient and extragradient method. J. Optim. Theory Appl. 188(3), 605–627 (2021)

  16. Bai, J., Hager, W.W., Zhang, H.: An inexact accelerated stochastic ADMM for separable convex optimization. Comput. Optim. Appl. 81, 479–518 (2022)

  17. Bai, J., Han, D., Sun, H., Zhang, H.: Convergence on a symmetric accelerated stochastic ADMM with larger stepsizes. CSIAM-AM. (2022). https://doi.org/10.4208/csiam-am.SO-2021-0021

  18. Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4), 2057–2075 (2014)

  19. Nitanda, A.: Stochastic proximal gradient descent with acceleration techniques. NIPS 27, 1574–1582 (2014)

  20. Wang, X., Wang, S., Zhang, H.: Inexact proximal stochastic gradient method for convex composite optimization. Comput. Optim. Appl. 68(3), 579–618 (2017)

  21. Allen-Zhu, Z.: Katyusha: the first direct acceleration of stochastic gradient methods. J. Mach. Learn. Res. 18(1), 8194–8244 (2017)

  22. Reddi, S., Sra, S., Poczos, B., Smola, A.J.: Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. NIPS 29, 1145–1153 (2016)

  23. Pham, N.H., Nguyen, L.M., Phan, D.T., Tran-Dinh, Q.: ProxSARAH: an efficient algorithmic framework for stochastic composite nonconvex optimization. J. Mach. Learn. Res. 21(110), 1–48 (2020)

  24. Shivaswamy, P.K., Jebara, T.: Maximum relative margin and data-dependent regularization. J. Mach. Learn. Res. 11(2), 747–788 (2010)

  25. Zhang, T., Zhou, Z.H.: Optimal margin distribution machine. IEEE Trans. Knowl. Data Eng. 32(6), 1143–1156 (2019)

  26. Crammer, K., Dredze, M., Pereira, F.: Confidence-weighted linear classification for text categorization. J. Mach. Learn. Res. 13(1), 1891–1926 (2012)

  27. Bertsimas, D., Gupta, V., Kallus, N.: Robust sample average approximation. Math. Program. 171(1), 217–282 (2018)

  28. Chen, X.: Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 134(1), 71–99 (2012)

  29. Chen, X.: Smoothing methods for complementarity problems and their applications: a survey. J. Oper. Res. Soc. Jpn. 43(1), 32–47 (2000)

  30. Zhang, C., Chen, X.: A smoothing active set method for linearly constrained non-Lipschitz nonconvex optimization. SIAM J. Optim. 30, 1–30 (2020)

  31. Zhang, C., Chen, X.: Smoothing projected gradient method and its application to stochastic linear complementarity problems. SIAM J. Optim. 20, 627–649 (2009)

  32. Polyak, B.: Introduction to Optimization. Optimization Software Inc., New York (1987)

  33. Ouyang, H., Gray, A.G.: Stochastic smoothing for nonsmooth minimizations: accelerating SGD by exploiting structure. ICML 2, 1523–1530 (2012)

  34. Devolder, O., Glineur, F., Nesterov, Y.: Double smoothing technique for large-scale linearly constrained convex optimization. SIAM J. Optim. 22(2), 702–727 (2012)

  35. Tran-Dinh, Q.: Adaptive smoothing algorithms for nonsmooth composite convex minimization. Comput. Optim. Appl. 66(3), 425–451 (2017)

  36. Duchi, J.C., Bartlett, P.L., Wainwright, M.J.: Randomized smoothing for stochastic optimization. SIAM J. Optim. 22(2), 674–701 (2012)

  37. Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. http://www.math.washington.edu/~tseng/papers/apgm.pdf (2008)

  38. Li, Z., Li, J.: A simple proximal stochastic gradient method for nonsmooth nonconvex optimization. NIPS 31, 5569–5579 (2018)

  39. Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133(1), 365–397 (2012)

  40. Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. NIPS 27, 422 (2014)

  41. Reddi, S.J., Hefny, A., Sra, S., Póczos, B., Smola, A.: Stochastic variance reduction for nonconvex optimization. ICML 2, 314–323 (2016)

  42. Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: a novel method for machine learning problems using stochastic recursive gradient. ICML 5, 2613–2621 (2017)

  43. Zhou, K., Jin, Y., Ding, Q., Cheng, J.: Amortized Nesterov's momentum: a robust momentum and its application to deep learning. UAI 7, 211–220 (2020)

  44. Beck, A.: First-Order Methods in Optimization. SIAM, Philadelphia (2017)

  45. Noble, W.S.: What is a support vector machine? Nat. Biotechnol. 24(12), 1565–1567 (2006)

  46. Huang, S., Cai, N., Pacheco, P.P., Narrandes, S., Wang, Y., Xu, W.: Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genom. Proteom. 15(1), 41–51 (2018)

  47. Rodriguez, R., Vogt, M., Bajorath, J.: Support vector machine classification and regression prioritize different structural features for binary compound activity and potency value prediction. ACS Omega 2(10), 6371–6379 (2017)

  48. Chang, C., Lin, C.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)


Acknowledgements

This work is supported in part by the Natural Science Foundation of Beijing, China (Grant No. 1202021) and the Natural Science Foundation of China (Grant No. 12171027).

Author information

Corresponding author

Correspondence to Chao Zhang.

About this article

Cite this article

Wang, R., Zhang, C., Wang, L. et al. A Stochastic Nesterov’s Smoothing Accelerated Method for General Nonsmooth Constrained Stochastic Composite Convex Optimization. J Sci Comput 93, 52 (2022). https://doi.org/10.1007/s10915-022-02016-1
