
A stochastic primal-dual method for a class of nonconvex constrained optimization

Computational Optimization and Applications

Abstract

In this paper we study a class of nonconvex optimization problems which involve uncertainty in the objective and a large number of nonconvex functional constraints. Challenges often arise when solving this type of problem due to the nonconvexity of the feasible set and the high cost of evaluating the function values and gradients of all constraints simultaneously. To handle these issues, we propose a stochastic primal-dual method in this paper. At each iteration, a proximal subproblem based on a stochastic approximation of an augmented Lagrangian function is solved to update the primal variable, which is then used to update the dual variables. We explore theoretical properties of the proposed algorithm and establish its iteration and sample complexities for finding an \(\epsilon\)-stationary point of the original problem. Numerical tests on a weighted maximin dispersion problem and a nonconvex quadratically constrained optimization problem demonstrate the promising performance of the proposed algorithm.
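To make the algorithmic template concrete, the following is a minimal sketch of one iteration, assuming the piecewise augmented Lagrangian term \(\psi _{\beta }\) stated in Appendix A.1, a Euclidean proximal (projection) step, a sampled minibatch of constraints, and a simple multiplier update on the sampled constraints; the step sizes `alpha` and `rho`, the batch rescaling, and all function names are illustrative assumptions, not the authors' exact scheme.

```python
def psi_beta(u, v, beta):
    # Augmented Lagrangian term for one constraint f_i(x) <= 0 with
    # multiplier v (the piecewise form stated in Appendix A.1).
    return u * v + 0.5 * beta * u ** 2 if beta * u + v >= 0 else -v ** 2 / (2 * beta)

def stochastic_primal_dual_step(x, z, beta, alpha, rho,
                                grad_f0, f, grad_f, batch, M, project):
    # One illustrative iteration: a stochastic gradient of the augmented
    # Lagrangian in x, a projected (proximal) step, then a dual update on
    # the sampled constraints only.
    g = grad_f0(x)  # stochastic gradient estimate of the objective
    for i in batch:
        u, v = f(x, i), z[i]
        # d/du psi_beta(u, v) = v + beta * u on the branch beta*u + v >= 0,
        # and 0 otherwise, so only "active" sampled constraints contribute.
        if beta * u + v >= 0:
            g = g + (M / len(batch)) * (v + beta * u) * grad_f(x, i)
    x_new = project(x - alpha * g)  # Euclidean proximal step onto X
    z_new = z.copy()
    for i in batch:
        z_new[i] = max(0.0, z[i] + rho * f(x_new, i))  # multiplier update
    return x_new, z_new
```

The \(M/|\text{batch}|\) factor keeps the sampled term an unbiased estimate of the full constraint gradient (assuming the augmented Lagrangian sums \(\psi _{\beta }\left( f_{i}\left( x\right) ,z_{i}\right)\) over the \(M\) constraints), and updating only the sampled multipliers avoids evaluating all \(M\) constraints per iteration.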


Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Notes

  1. For any feasible point \(x\in X\) with \(f(x)\le {\mathbf {0}}\), we say that BCQ holds at x if the only vector \(y:=\left( y_{1},\ldots ,y_{M} \right) ^{{\prime }}\) satisfying \(-J_f(x)^{{\prime }} y \in \mathcal {N}_{X}\left( x\right)\), \(y\ge {\mathbf {0}}\), and \(y_{i}= 0\) whenever \(f_{i}\left( x\right) <0\), \(i=1,\dots ,M\), is \(y={\mathbf {0}}\).
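For intuition, here is a small worked illustration of BCQ (my own example, not from the paper), specialized to the unconstrained-set case:

```latex
% Illustration: BCQ when X = R^n, so that N_X(x) = {0}.
% Then -J_f(x)' y \in N_X(x) reads J_f(x)' y = 0, and BCQ at x says:
%   the only y >= 0, supported on the active set A(x) = {i : f_i(x) = 0},
%   with \sum_{i \in A(x)} y_i \nabla f_i(x) = 0, is y = 0,
% i.e. the active constraint gradients are positively linearly independent
% (an MFCQ-type condition).
%
% Example (n = 1, M = 2): f_1(x) = x, f_2(x) = -x. At x = 0 both constraints
% are active, and y_1 f_1'(0) + y_2 f_2'(0) = y_1 - y_2 = 0 admits y = (t, t)
% with t > 0, so BCQ fails at x = 0; with f_1 alone it clearly holds.
```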


Funding

This research was partially supported by the National Natural Science Foundation of China (11871453, 11731013) and the Major Key Project of PCL (PCL2022A05).

Author information


Corresponding author

Correspondence to Xiao Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A.1 Lipschitz continuity of \(\psi _{\beta }\left( u,v\right)\) in \(v \in \mathbb {R}_{+}\)

For any \(u \in \mathbb {R}\),

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right| \le \left| u \right| \left| v_{1}-v_{2} \right| ,\quad \forall v_{1},v_{2}\ge 0 . \end{aligned}$$

Proof

Recall that \(\psi _{\beta }\left( u, v\right) =\begin{cases} u v+\frac{\beta }{2} u^{2}, &{} \text{ if } \beta u+v \ge 0,\\ -\frac{v^{2}}{2 \beta }, &{} \text{ if } \beta u+v<0,\end{cases}\) with \(\beta >0\). If \(u \ge 0\), then \(\beta u+v_{i} \ge 0\) for \(i=1,2\), so \(\psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) = u\left( v_{1}-v_{2}\right)\) and the result holds. It therefore suffices to consider the case \(u < 0\).

If \(\beta u+v_{1} \ge 0\) and \(\beta u+v_{2} \ge 0\), both values fall in the first branch and the difference is again \(u\left( v_{1}-v_{2}\right)\), so the result holds. If \(\beta u+v_{1} \ge 0\) and \(\beta u+v_{2} < 0\), then, since \(\psi _{\beta }\left( u, v\right)\) is monotonically decreasing in \(v\ge 0\) when \(u < 0\) and \(v_{2}< -\beta u \le v_{1}\), we have

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right|&=\psi _{\beta }\left( u,v_{2}\right) -\psi _{\beta }\left( u,v_{1}\right) = -\frac{v_{2}^{2}}{2 \beta }-u v_{1}-\frac{\beta }{2} u^{2} \\&\le u v_{2}+\frac{\beta }{2} u^{2} -u v_{1}-\frac{\beta }{2} u^{2} =\left| u \right| \left| v_{1}-v_{2} \right| . \end{aligned}$$

where the inequality uses \(-\frac{v_{2}^{2}}{2\beta } \le u v_{2}+\frac{\beta }{2} u^{2}\), which is equivalent to \(\left( \beta u+v_{2}\right) ^{2}\ge 0\). Finally, if \(\beta u+v_{1}< 0\) and \(\beta u+v_{2} < 0\), then we have

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right| =\frac{ \left| v_{1}^{2}-v_{2}^{2} \right| }{2\beta }= \frac{ v_{1}+v_{2} }{2\beta } \left| v_{1}-v_{2} \right| \le -u \left| v_{1}-v_{2} \right| =\left| u \right| \left| v_{1}-v_{2} \right| , \end{aligned}$$

where the inequality holds since \(v_{1},v_{2}< -\beta u\).

\(\square\)
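The bound can also be sanity-checked numerically; the following script (an illustrative check, not part of the paper) samples random \(\left( u,v_{1},v_{2},\beta \right)\) and verifies the claimed Lipschitz inequality:

```python
import numpy as np

def psi_beta(u, v, beta):
    # Piecewise augmented Lagrangian term from Appendix A.1.
    return u * v + 0.5 * beta * u ** 2 if beta * u + v >= 0 else -v ** 2 / (2 * beta)

rng = np.random.default_rng(0)
for _ in range(100_000):
    beta = rng.uniform(0.1, 10.0)
    u = rng.normal(scale=5.0)
    v1, v2 = rng.uniform(0.0, 10.0, size=2)
    lhs = abs(psi_beta(u, v1, beta) - psi_beta(u, v2, beta))
    # Claimed bound: |psi(u, v1) - psi(u, v2)| <= |u| |v1 - v2|
    assert lhs <= abs(u) * abs(v1 - v2) + 1e-9, (u, v1, v2, beta)
print("Lipschitz bound held on all samples")
```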

A.2 Nonexpansiveness of the prox operator

For \(x^{k+1}\) defined in (15) and \({\hat{x}}^{k+1}\) defined in (30), it holds that

$$\begin{aligned} \left\| x^{k+1}-{\hat{x}}^{k+1} \right\| \le \alpha _{k} \left\| \nabla f_{0}\left( x^{k}\right) +\nabla _{x}\Psi _{\beta }\left( x^{k},z^{k}\right) -g_{0}^{k}-h^{k} \right\| . \end{aligned}$$

Proof

From the optimality condition for (15), there exists a subgradient \(u \in \partial \chi _{0}\left( x^{k+1}\right)\) such that

$$\begin{aligned} \left\langle u+g_{0}^{k}+h^{k}+\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y=x^{k+1}} ,x-x^{k+1} \right\rangle \ge 0,\quad \forall x\in X. \end{aligned}$$

By the convexity of \(\chi _{0}\), \(\left\langle u, x-x^{k+1}\right\rangle \le \chi _{0}\left( x\right) -\chi _{0}\left( x^{k+1}\right)\) for all \(x\), and hence

$$\begin{aligned} \chi _{0}\left( x \right) -\chi _{0}\left( x^{k+1} \right) + \left\langle g_{0}^{k}+h^{k}+\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y=x^{k+1}} ,x-x^{k+1}\right\rangle \ge 0,\quad \forall x\in X. \end{aligned}$$
(54)

Similarly, the optimality condition for \({\hat{x}}^{k+1}\) yields that for any \(x\in X\),

$$\begin{aligned}&\chi _{0}\left( x \right) -\chi _{0}\left( {\hat{x}}^{k+1} \right) + \left\langle \nabla f_{0}\left( x^{k} \right) +\nabla _{x} \Psi _{\beta }\left( x^{k},z^{k}\right) \right. \nonumber \\&\quad \left. +\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y={\hat{x}}^{k+1}} ,x-{\hat{x}}^{k+1} \right\rangle \ge 0. \end{aligned}$$
(55)

Letting \(x={\hat{x}}^{k+1}\) in (54) and \(x=x^{k+1}\) in (55) and adding them up, we obtain

$$\begin{aligned}&\, \left\langle \nabla f_{0}\left( x^{k}\right) +\nabla _{x}\Psi _{\beta }\left( x^{k},z^{k}\right) -g_{0}^{k}-h^{k}, x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ \ge&\, \frac{1}{\alpha _{k}} \left\langle \nabla _{y} V\left( x^{k},{y} \right) {|}_{y=x^{k+1}} -\nabla _{y} V\left( x^{k}, y\right) {|}_{y={\hat{x}}^{k+1}} ,x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ =&\, \frac{1}{\alpha _{k}} \left\langle \nabla v\left( x^{k+1}\right) -\nabla v\left( {\hat{x}}^{k+1}\right) ,x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ \ge&\frac{1}{\alpha _{k}} \left\| x^{k+1}-{\hat{x}}^{k+1} \right\| ^{2} , \end{aligned}$$

where the last inequality follows from the 1-strong convexity of \(v\left( x\right)\). Applying the Cauchy–Schwarz inequality to the left-hand side then yields the result. \(\square\)
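In the Euclidean case \(V\left( x,y\right) =\frac{1}{2}\left\| y-x \right\| ^{2}\), i.e. \(v\left( x\right) =\frac{1}{2}\left\| x \right\| ^{2}\), the lemma reduces to the familiar nonexpansiveness of the proximal map. A small numerical illustration (my own, with projection onto a box standing in for the prox over X):

```python
import numpy as np

def prox_box(y, lo=-1.0, hi=1.0):
    # Euclidean projection onto the box [lo, hi]^n, a simple prox over X.
    return np.clip(y, lo, hi)

rng = np.random.default_rng(1)
n, alpha = 20, 0.3
for _ in range(10_000):
    xk = rng.normal(size=n)
    g_true = rng.normal(size=n)           # stands in for grad f0 + grad_x Psi_beta
    g_est = g_true + rng.normal(size=n)   # stands in for the estimate g0 + h
    x_next = prox_box(xk - alpha * g_est)   # x^{k+1}
    x_hat = prox_box(xk - alpha * g_true)   # \hat{x}^{k+1}
    # Lemma A.2 bound in the Euclidean case:
    # ||x^{k+1} - \hat{x}^{k+1}|| <= alpha * ||true gradient - estimate||
    assert (np.linalg.norm(x_next - x_hat)
            <= alpha * np.linalg.norm(g_true - g_est) + 1e-12)
print("prox nonexpansiveness bound held on all samples")
```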

About this article

Cite this article

Jin, L., Wang, X. A stochastic primal-dual method for a class of nonconvex constrained optimization. Comput Optim Appl 83, 143–180 (2022). https://doi.org/10.1007/s10589-022-00384-w

