A stochastic primal-dual method for a class of nonconvex constrained optimization

Jin, Lingzi; Wang, Xiao

doi:10.1007/s10589-022-00384-w

A stochastic primal-dual method for a class of nonconvex constrained optimization

Published: 24 June 2022

Volume 83, pages 143–180, (2022)
Cite this article

Computational Optimization and Applications Aims and scope Submit manuscript

1142 Accesses
2 Citations
1 Altmetric
Explore all metrics

Abstract

In this paper we study a class of nonconvex optimization which involves uncertainty in the objective and a large number of nonconvex functional constraints. Challenges often arise when solving this type of problems due to the nonconvexity of the feasible set and the high cost of calculating function value and gradient of all constraints simultaneously. To handle these issues, we propose a stochastic primal-dual method in this paper. At each iteration, a proximal subproblem based on a stochastic approximation to an augmented Lagrangian function is solved to update the primal variable, which is then used to update dual variables. We explore theoretical properties of the proposed algorithm and establish its iteration and sample complexities to find an $\epsilon$-stationary point of the original problem. Numerical tests on a weighted maximin dispersion problem and a nonconvex quadratically constrained optimization problem demonstrate the promising performance of the proposed algorithm.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Study of a primal-dual algorithm for equality constrained minimization

Article 31 July 2014

Complexity of an inexact proximal-point penalty method for constrained smooth non-convex optimization

Article 03 March 2022

Level constrained first order methods for function constrained optimization

Article Open access 06 March 2024

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Notes

For any feasible point $x\in X$ with $f(x)\le {\mathbf {0}}$, we say that BCQ holds at x if we can deduce $y:=\left( y_{1},\ldots ,y_{M} \right) ^{{\prime }}={\mathbf {0}}$ from $-J_f(x)^{{\prime }} y \in \mathcal {N}_{X}\left( x\right)$, $y\ge {\mathbf {0}},$ and $y_{i}= 0 \text { if } f_{i}\left( x\right) <0 ,i=1,\dots ,M.$

References

Boob, D., Deng, Q., Lan, G.: Stochastic first-order methods for convex and nonconvex functional constrained optimization. Math. Program. (2022)
Campi, M.C., Garatti, S.: A sampling-and-discarding approach to chance-constrained optimization: Feasibility and optimality. J. Optim. Theory App. 148(2), 257–280 (2011)
Article MathSciNet Google Scholar
Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In: 28th NIPS, vol. 27 (2014)
Ghadimi, S.: Conditional gradient type methods for composite nonlinear and stochastic optimization. Math. Program. 173(1–2), 431–464 (2019)
Article MathSciNet Google Scholar
Haines, S., Loeppky, J., Tseng, P., Wang, X.: Convex relaxations of the weighted maxmin dispersion problem. SIAM J. Optim. 23(4), 2264–2294 (2013)
Article MathSciNet Google Scholar
Hamedani, E.Y., Aybat, N.S.: A primal-dual algorithm with line search for general convex-concave saddle point problems. SIAM J. Optim. 31(2), 1299–1329 (2021)
Article MathSciNet Google Scholar
Hiriart-Urruty, J.-B., Lemaréchal, C.: Fundamentals of Convex Analysis. Springer (2001)
Huo, Z., Gu, B., Liu, J., Huang, H.: Accelerated method for stochastic composition optimization with nonsmooth regularization. In: 32nd AAAI Conference on Artificial Intelligence, pp. 3287–3294. AAAI (2018)
Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. NIPS 1(3), 315–323 (2013)
Google Scholar
Lan, G.: First-Order and Stochastic Optimization Methods for Machine Learning. Springer, Cham (2020)
Book Google Scholar
Lan, G., Romeijn, E., Zhou, Z.: Conditional gradient methods for convex optimization with general affine and nonlinear constraints. SIAM J. Optim. 31(3), 2307–2339 (2021)
Article MathSciNet Google Scholar
Lan, G., Zhou, Z.: Algorithms for stochastic optimization with function or expectation constraints. Comput. Optim. App. 76(2), 461–498 (2020)
Article MathSciNet Google Scholar
Li, Z., Chen, P.-Y., Liu, S., Lu, S., Xu, Y.: Rate-improved inexact augmented lagrangian method for constrained nonconvex optimization. In: 24th AISTATS, vol. 130, pp. 2170–2178 (2021)
Li, Z., Xu, Y.: Augmented lagrangian based first-order methods for convex-constrained programs with weakly-convex objective. INFROMS J. Optim. 3(4), 373–397 (2021)
MathSciNet Google Scholar
Lin, Q., Ma, R., Xu, Y.: Complexity of an inexact proximal-point penalty methods for constrained non-convex optimization. Comput. Optim. Appl. 82(1), 175–224 (2022)
Article MathSciNet Google Scholar
Lin, Q., Nadarajah, S., Soheili, N.: A level-set method for convex optimization with a feasible solution path. SIAM J. Optim. 28(4), 3290–3311 (2018)
Article MathSciNet Google Scholar
Lin, T., Jin, C., Jordan, M.I.: On gradient descent ascent for nonconvex-concave minimax problems. In: 37th ICML, vol. 119, pp. 6083–6093 (2020)
Lu, S., Tsaknakis, I., Hong, M., Chen, Y.: Hybrid block successive approximation for one-sided non-convex min-max problems: algorithms and applications. IEEE T. Signal Proces. 68, 3676–3691 (2020)
Article MathSciNet Google Scholar
Milzarek, A., Xiao, X., Cen, S., Wen, Z., Ulbrich, M.: A stochastic semismooth newton method for nonsmooth nonconvex optimization. SIAM J. Optim. 29(4), 2916–2948 (2019)
Article MathSciNet Google Scholar
Nemirovski, A.: Prox-method with rate of convergence o(1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1), 229–251 (2004)
Article MathSciNet Google Scholar
Nguyen, L.M., Liu, J., Scheinberg, K., Takác̆, M.: SARAH: a novel method for machine learning problems using stochastic recursive gradient. In: 34th ICML, vol. 70, pp. 2613–2621 (2017)
Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (2006)
MATH Google Scholar
Nouiehed, M., Sanjabi, M., Huang, T., Lee, J.D., Razaviyayn, M.: Solving a class of non-convex min-max games using iterative first order methods. In: 33th NIPS, vol. 32 (2019)
Pan, W., Shen, J., Xu, Z.: An efficient algorithm for nonconvex-linear minimax optimization problem and its application in solving weighted maximin dispersion problem. Comput. Optim. Appl. 78(1), 287–306 (2021)
Article MathSciNet Google Scholar
Pham, N.H., Nguyen, L.M., Phan, D.T., Quoc, T.-D.: ProxSARAH: An efficient algorithmic framework for stochastic composite nonconvex optimization. J. Mach. Lean. Res. 21(110), 1–48 (2020)
MathSciNet MATH Google Scholar
Poljak, B.T.: A general method for solving extremal problems. Dokl. Akad. Nauk SSSR 174(1), 33–36 (1967)
MathSciNet Google Scholar
Rafique, H., Liu, M., Lin, Q., Yang, T.: Weakly-convex concave min-max optimization: provable algorithms and applications in machine learning. Optim. Method Softw. pp. 1–35 (2021)
Reddi, S.J., Sra, S., Poczos, B., Smola, A.: Stochastic Frank-Wolfe Methods for Nonconvex Optimization. In: 54TH ALLERTON, pp. 1244–1251 (2016)
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Statist. 22(3), 400–407 (1951)
Article MathSciNet Google Scholar
Rockafellar, R.T.: Convex Analysis. Princeton University Press (1972)
Rockafellar, R.T.: Lagrange multipliers and optimality. SIAM Rev. 35(2), 183–238 (1993)
Article MathSciNet Google Scholar
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer Science & Business Media (2009)
Sahin, M. F., Eftekhari, A., Alacaoglu, A., Latorre, F., Cevher, V.: An inexact augmented lagrangian framework for nonconvex optimization with nonlinear constraints. In: 33th NIPS, vol. 32 (2019)
Seri, R., Choirat, C.: Scenario approximation of robust and chance-constrained programs. J. Optim. Theory App. 158(2), 590–614 (2013)
Article MathSciNet Google Scholar
Wang, S., Xia, Y.: On the ball-constrained weighted maximin dispersion problem. SIAM J. Optim. 26(3), 1565–1588 (2016)
Article MathSciNet Google Scholar
Wang, X., Wang, X., Yuan, Y.X.: Stochastic proximal quasi-newton methods for non-convex composite optimization. Optim. Method Softw. 34(5), 922–948 (2019)
Article MathSciNet Google Scholar
Wang, X., Yuan, Y.: An augmented lagrangian trust region method for equality constrained optimization. Optim. Method Softw. 30(3), 559–582 (2015)
Article MathSciNet Google Scholar
Wang, X., Zhang, H.: An augmented lagrangian affine scaling method for nonlinear programming. Optim. Methods Softw. 30(5), 934–964 (2015)
Article MathSciNet Google Scholar
Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4), 2057–2075 (2014)
Article MathSciNet Google Scholar
Xu, Y.: Primal-dual stochastic gradient method for convex programs with many functional constraints. SIAM J. Optim. 30(2), 1664–1692 (2020)
Article MathSciNet Google Scholar
Xu, Y.: First-order methods for constrained convex programming based on linearized augmented lagrangian function. INFORMS J. Optim. 3(1), 89–117 (2021)
Article MathSciNet Google Scholar
Xu, Z., Zhang, H., Xu, Y., Lan, G.: A unified single-loop alternating gradient projection algorithm for nonconvex-concave and convex-nonconcave minimax problems (2020). arXiv preprint arXiv:2006.02032
Zhang, J., Xiao, P., Sun, R., Luo, Z.: A single-loop smoothed gradient descent-ascent algorithm for nonconvex-concave min-max problems. In: 34th NIPS, vol. 33 (2020)

Download references

Funding

This research was partially supported by the National Natural Science Foundation of China (11871453, 11731013) and the Major Key Project of PCL (PCL2022A05).

Author information

Authors and Affiliations

School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
Lingzi Jin & Xiao Wang
Peng Cheng Laboratory, Shenzhen, 518066, China
Lingzi Jin & Xiao Wang

Authors

Lingzi Jin
View author publications
You can also search for this author in PubMed Google Scholar
Xiao Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiao Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 A.1: Lipschitz-continuity of $\psi _{\beta }\left( u,v\right)$ in $v \in \mathbb {R}_{+}$

For any $u \in \mathbb {R}$,

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right| \le \left| u \right| \left| v_{1}-v_{2} \right| ,\quad \forall v_{1},v_{2}\ge 0 . \end{aligned}$$

Proof

For $\psi _{\beta }\left( u, v\right) =\left\{ \begin{array}{ll} u v+\frac{\beta }{2} u^{2}, &{} \text{ if } \beta u+v \ge 0 \\ -\frac{v^{2}}{2 \beta }, &{} \text{ if } \beta u+v<0\end{array}\right. ,\beta >0$, the result obviously holds for any $u \ge 0$. So we will just talk about case when $u < 0$.

If $\beta u+v_{1} \ge 0,\beta u+v_{2} \ge 0$, the result holds obviously. If $\beta u+v_{1} \ge 0,\beta u+v_{2} < 0$, since $\psi _{\beta }\left( u, v\right)$ is monotonically decreasing in $v\ge 0$ when $u < 0$, we have

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right|&=\psi _{\beta }\left( u,v_{2}\right) -\psi _{\beta }\left( u,v_{1}\right) = -\frac{v_{2}^{2}}{2 \beta }-u v_{1}-\frac{\beta }{2} u^{2} \\&\le u v_{2}+\frac{\beta }{2} u^{2} -u v_{1}-\frac{\beta }{2} u^{2} =\left| u \right| \left| v_{1}-v_{2} \right| . \end{aligned}$$

If $\beta u+v_{1}< 0,\beta u+v_{2} < 0$, then we have

$$\begin{aligned} \left| \psi _{\beta }\left( u,v_{1}\right) -\psi _{\beta }\left( u,v_{2}\right) \right| =\frac{ \left| v_{1}^{2}-v_{2}^{2} \right| }{2\beta }= \frac{ v_{1}+v_{2} }{2\beta } \left| v_{1}-v_{2} \right| \le -u \left| v_{1}-v_{2} \right| =\left| u \right| \left| v_{1}-v_{2} \right| . \end{aligned}$$

$\square$

1.2 A.2: Nonexpansiveness of prox operator

For $x^{k+1}$ defined in (15) and ${\hat{x}}^{k+1}$ defined in (30), it holds that

$$\begin{aligned} \left\| x^{k+1}-{\hat{x}}^{k+1} \right\| \le \alpha _{k} \left\| \nabla f_{0}\left( x^{k}\right) +\nabla _{x}\Psi _{\beta }\left( x^{k},z^{k}\right) -g_{0}^{k}-h^{k} \right\| . \end{aligned}$$

Proof

From the optimality condition for (15), there exists a subgradient $u \in \partial \chi _{0}\left( x^{k+1}\right)$ such that

$$\begin{aligned} \left\langle u+g_{0}^{k}+h^{k}+\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y=x^{k+1}} ,x-x^{k+1} \right\rangle \ge 0,\quad \forall x\in X. \end{aligned}$$

Applying the convexity of $\chi _{0}$, we obtain that

$$\begin{aligned} \chi _{0}\left( x \right) -\chi _{0}\left( x^{k+1} \right) + \left\langle g_{0}^{k}+h^{k}+\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y=x^{k+1}} ,x-x^{k+1}\right\rangle \ge 0,\quad \forall x\in X. \end{aligned}$$

(54)

Similarly, it yields from the optimality condition for ${\hat{x}}^{k+1}$ that for any $x\in X$,

$$\begin{aligned}&\chi _{0}\left( x \right) -\chi _{0}\left( {\hat{x}}^{k+1} \right) + \left\langle \nabla f_{0}\left( x^{k} \right) +\nabla _{x} \Psi _{\beta }\left( x^{k},z^{k}\right) \right. \nonumber \\&\quad \left. +\frac{1}{\alpha _{k}}\nabla _{y} V\left( x^{k}, y\right) {|}_{y={\hat{x}}^{k+1}} ,x-{\hat{x}}^{k+1} \right\rangle \ge 0. \end{aligned}$$

(55)

Letting $x={\hat{x}}^{k+1}$ in (54) and $x=x^{k+1}$ in (55) and adding them up, we obtain

$$\begin{aligned}&\, \left\langle \nabla f_{0}\left( x^{k}\right) +\nabla _{x}\Psi _{\beta }\left( x^{k},z^{k}\right) -g_{0}^{k}-h^{k}, x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ \ge&\, \frac{1}{\alpha _{k}} \left\langle \nabla _{y} V\left( x^{k},{y} \right) {|}_{y=x^{k+1}} -\nabla _{y} V\left( x^{k}, y\right) {|}_{y={\hat{x}}^{k+1}} ,x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ =&\, \frac{1}{\alpha _{k}} \left\langle \nabla v\left( x^{k+1}\right) -\nabla v\left( {\hat{x}}^{k+1}\right) ,x^{k+1}-{\hat{x}}^{k+1} \right\rangle \\ \ge&\frac{1}{\alpha _{k}} \left\| x^{k+1}-{\hat{x}}^{k+1} \right\| ^{2} , \end{aligned}$$

where the last inequality comes from the 1-strong convexity of $v\left( x\right)$. Then applying Cauchy-Schwarz inequality indicates the result. $\square$

Rights and permissions

Reprints and permissions

About this article

Cite this article

Jin, L., Wang, X. A stochastic primal-dual method for a class of nonconvex constrained optimization. Comput Optim Appl 83, 143–180 (2022). https://doi.org/10.1007/s10589-022-00384-w

Download citation

Received: 06 July 2021
Accepted: 26 May 2022
Published: 24 June 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s10589-022-00384-w

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A stochastic primal-dual method for a class of nonconvex constrained optimization

Abstract

Access this article

Similar content being viewed by others

Study of a primal-dual algorithm for equality constrained minimization

Complexity of an inexact proximal-point penalty method for constrained smooth non-convex optimization

Level constrained first order methods for function constrained optimization

Data availability

Notes

References

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

1.1 A.1: Lipschitz-continuity of \(\psi _{\beta }\left( u,v\right)\) in \(v \in \mathbb {R}_{+}\)

Proof

1.2 A.2: Nonexpansiveness of prox operator

Proof

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

A stochastic primal-dual method for a class of nonconvex constrained optimization

Abstract

Access this article

Similar content being viewed by others

Study of a primal-dual algorithm for equality constrained minimization

Complexity of an inexact proximal-point penalty method for constrained smooth non-convex optimization

Level constrained first order methods for function constrained optimization

Data availability

Notes

References

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

1.1 A.1: Lipschitz-continuity of \(\psi _{\beta }\left( u,v\right)\) in \(v \in \mathbb {R}_{+}\)

Proof

1.2 A.2: Nonexpansiveness of prox operator

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation