An Interior Stochastic Gradient Method for a Class of Non-Lipschitz Optimization Problems


Abstract

In this paper, we propose an interior stochastic gradient method for bound constrained optimization problems whose objective function is the sum of a smooth expectation of random integrands and a non-Lipschitz \(\ell _p\) \((0<p<1)\) regularizer. We first define an L-stationary point of the problem and show that an L-stationary point is a first-order stationary point and that a global minimizer is an L-stationary point. We then propose a stochastic gradient method that reduces the computational cost of evaluating exact gradients of the smooth term in the objective function and maintains the feasibility of the iterates. We show that the proposed method converges to an \(\epsilon \)-approximate L-stationary point within \({\mathcal {O}}(\epsilon ^{-4})\) computational complexity, with the optimality measured in the \(\ell _2\) norm. Finally, we conduct preliminary numerical experiments to test the proposed method and compare it with some existing methods. The preliminary numerical results indicate that the proposed method is promising.
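For orientation, the problem class described in the abstract can be read as the bound constrained stochastic program

$$\begin{aligned} \min _{x\in {\mathbb {R}}^n}\;{\mathbb {E}}_{\xi }\big [f(x,\xi )\big ]+\lambda \sum _{i=1}^{n}|x_i|^{p}\quad \text{ subject } \text{ to } \quad l\le x\le u, \end{aligned}$$

with \(0<p<1\) and \(\lambda >0\), where the expectation term is smooth and the \(\ell _p\) term is the non-Lipschitz regularizer; this display is a sketch inferred from the abstract rather than the paper's exact formulation.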

Data Availability

All data sets used in Figs. 5-7 were downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.



Acknowledgements

We would like to thank two anonymous reviewers for their valuable comments and suggestions which helped to improve the paper.

Funding

This work was supported by Chinese NSF grants (Nos. 11771157, 11731013, 11871453, 11961011 and 11971106), the Ministry of Education Humanities and Social Sciences project (No. 17JYJAZH011), the Young Elite Scientists Sponsorship Program by CAST (2018QNRC001), the Youth Innovation Promotion Association, CAS, special projects in key fields of Guangdong Universities (2021ZDZX1054), the Natural Science Foundation of Guangdong Province (2022A1515010567), the Major Key Project of PCL (No. PCL2022A05), PolyU153001/18P, and the University Research Facility in Big Data Analytics of The Hong Kong Polytechnic University.

Author information

Corresponding author

Correspondence to Xiao Wang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proof of Lemma 2.2

Since h is level bounded, the solution set of (2.2) is nonempty and bounded.

First, assume \(\alpha < {\bar{\omega }}\). Take any \(t^*\in {\mathcal {T}}^*_\alpha \) and suppose that \(t^*\ne 0\); then \(t^*>0\). By the optimality condition for (2.2), we have

$$\begin{aligned} \alpha =t^*+\frac{\lambda p}{L}(t^*)^{p-1}. \end{aligned}$$

Then it derives

$$\begin{aligned} 0\ge h(t^*)-h(0)= \;&\frac{1}{2}(t^*-\alpha )^2 +\frac{\lambda }{L}(t^*)^p -\frac{1}{2}\alpha ^2\\ = \;&\frac{1}{2}(t^*)^2 - t^*\alpha + \frac{\lambda }{L}(t^*)^p \\ =\;&\frac{1}{2}(t^*)^2 - t^*(t^*+\frac{\lambda p}{L}(t^*)^{p-1}) + \frac{\lambda }{L}(t^*)^p\\ = \;&-\frac{1}{2}(t^*)^2 + \frac{\lambda }{L}(1-p)(t^*)^p \end{aligned}$$

which yields \(t^* \ge (\frac{2\lambda (1-p)}{L})^{\frac{1}{2-p}}\). Lemma 2.1 then implies that

$$\begin{aligned} \alpha \ge h_1\left( \left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\right) = {\bar{\omega }}. \end{aligned}$$

This contradicts the assumption \(\alpha <{\bar{\omega }}\). Hence, \({\mathcal {T}}^*_\alpha =\{0\}\).
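For later reference, substituting \(\bar{t}=\left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\) into \(h_1(t)=t+\frac{\lambda p}{L}t^{p-1}\) (the function from Lemma 2.1 used above and in (A.2) below) gives an explicit expression for the threshold:

$$\begin{aligned} {\bar{\omega }}=h_1(\bar{t})=\bar{t}+\frac{\lambda p}{L}\,\bar{t}^{\,p-1}=\bar{t}\left( 1+\frac{\lambda p}{L}\,\bar{t}^{\,-(2-p)}\right) =\bar{t}\left( 1+\frac{p}{2(1-p)}\right) =\frac{2-p}{2(1-p)}\left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}. \end{aligned}$$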

Second, assume \(\alpha ={\bar{\omega }}\). A direct computation gives

$$\begin{aligned} h(0)=h\left( \left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\right) . \end{aligned}$$
(A.1)
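Indeed, with \(\bar{t}=\left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\) and \(\alpha ={\bar{\omega }}=h_1(\bar{t})=\bar{t}+\frac{\lambda p}{L}\bar{t}^{\,p-1}\), the computation behind (A.1) reads

$$\begin{aligned} h(\bar{t})-h(0)=\frac{1}{2}\bar{t}^{\,2}-\bar{t}\alpha +\frac{\lambda }{L}\bar{t}^{\,p} =\frac{1}{2}\bar{t}^{\,2}-\bar{t}\left( \bar{t}+\frac{\lambda p}{L}\bar{t}^{\,p-1}\right) +\frac{\lambda }{L}\bar{t}^{\,p} =-\frac{1}{2}\bar{t}^{\,2}+\frac{\lambda (1-p)}{L}\bar{t}^{\,p}=0, \end{aligned}$$

since \(\bar{t}^{\,2-p}=\frac{2\lambda (1-p)}{L}\) is equivalent to \(\frac{\lambda (1-p)}{L}\bar{t}^{\,p}=\frac{1}{2}\bar{t}^{\,2}\).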

For any \(t>0\), the following equalities hold:

$$\begin{aligned} h(t) =h(0)+ \frac{1}{2}t^2-t\alpha +\frac{\lambda }{L}t^p \, \text{ and } \, h'(t)=t +p \frac{\lambda }{L}t^{p-1}-\alpha . \end{aligned}$$

Recall that by Lemma 2.1, \(h'(t)\) is monotonically decreasing in \((0,\varrho \,]\) and monotonically increasing in \([\,\varrho ,+\infty )\) with \(\varrho =(\frac{\lambda p(1-p)}{L})^{\frac{1}{2-p}}\). Then we obtain

$$\begin{aligned} \lim \limits _{t\rightarrow 0{+}}h'(t)>0,\;\; \; \lim \limits _{t\rightarrow +\infty }h'(t)>0\;\;\; \mathrm{and}\;\;\; h'(\varrho ) < h'\left( \left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\right) =0. \end{aligned}$$

Thus \(h'(t)\) has two roots:

$$\begin{aligned} t_1\in (0,\varrho ),\, \, t_2=\left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}} \in (\varrho ,+\infty ), \end{aligned}$$

with \(h'(t)>0\) on \((0,t_1) \cup (t_2,+\infty )\) and \(h'(t)<0\) on \((t_1,t_2)\). Thus h(t) is monotonically increasing on \((0,t_1]\) and \([t_2,+\infty )\), and monotonically decreasing on \([t_1,t_2]\). Then by (A.1) we obtain

$$\begin{aligned} h(t_1)=\max _{{t\in (0,t_1]}} h(t)\;\;\; \text{ and } \;\;\; h(t_2)=\min _{t>0} h(t). \end{aligned}$$

It thus follows that

$$\begin{aligned} h(t)>h(t_2) \quad \text{ for } \text{ any } 0<t\ne t_2, \end{aligned}$$

which together with (A.1) indicates that \({\mathcal {T}}^*_\alpha =\{0, t_2\}\) when \(\alpha ={\bar{\omega }}\).

Finally, assume \(\alpha > {\bar{\omega }}\). Let \(\bar{t}=\left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}\). Then for any \(t^*\in {\mathcal {T}}^*_\alpha \), it follows from \(h(t^*)\le h(\bar{t})\) that

$$\begin{aligned} h(t^*)=\frac{1}{2}(t^*-\alpha )^2 + \frac{\lambda }{L}(t^*)^p\le & {} \frac{1}{2}(\bar{t}-\alpha )^2 + \frac{\lambda }{L}\bar{t}^p\\= & {} { \frac{1}{2}\alpha ^2 + \bar{t}( {\bar{\omega }}- \alpha )}\\< & {} \frac{1}{2}\alpha ^2 = h(0). \end{aligned}$$
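The middle equality in the display above, stated without derivation, uses \(\frac{\lambda (1-p)}{L}\bar{t}^{\,p}=\frac{1}{2}\bar{t}^{\,2}\) together with \({\bar{\omega }}=h_1(\bar{t})\):

$$\begin{aligned} \frac{1}{2}(\bar{t}-\alpha )^2+\frac{\lambda }{L}\bar{t}^{\,p} =\frac{1}{2}\alpha ^2-\bar{t}\alpha +\frac{1}{2}\bar{t}^{\,2}+\frac{\lambda }{L}\bar{t}^{\,p} =\frac{1}{2}\alpha ^2-\bar{t}\alpha +\bar{t}^{\,2}+\frac{\lambda p}{L}\bar{t}^{\,p} =\frac{1}{2}\alpha ^2+\bar{t}\,\big (h_1(\bar{t})-\alpha \big ) =\frac{1}{2}\alpha ^2+\bar{t}\,({\bar{\omega }}-\alpha ), \end{aligned}$$

and the final strict inequality holds because \(\alpha >{\bar{\omega }}\) and \(\bar{t}>0\).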

Hence \(t^*\ne 0\), and thus \(t^*>0\). The optimality of \(t^*\) then gives

$$\begin{aligned} \alpha =t^*+\frac{\lambda p}{L}(t^*)^{p-1}=h_1(t^*), \end{aligned}$$
(A.2)

which together with \(h(t^*)< h(0)\) implies

$$\begin{aligned} t^*> \left( \frac{2\lambda (1-p)}{L}\right) ^{\frac{1}{2-p}}. \end{aligned}$$
(A.3)

Recall that by Lemma 2.1, \(h_1\) is monotonically increasing on \([(\frac{\lambda p(1-p)}{L})^{\frac{1}{2-p}}, +\infty )\). Hence the \(t^*\) satisfying (A.2) and (A.3) is unique. This completes the proof.
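As an illustrative sanity check of the three cases established above, the following Python sketch computes the minimizer of \(h(t)=\frac{1}{2}(t-\alpha )^2+\frac{\lambda }{L}t^p\) over \(t\ge 0\) via the characterization of Lemma 2.2 (compare \(\alpha \) with the threshold \({\bar{\omega }}\), then find the root of \(h_1(t)=\alpha \) by bisection) and confirms it against a brute-force grid search. The script assumes the scalar-subproblem reading of (2.2) used in this appendix; all function names and parameter values are illustrative and not part of the paper.

```python
import numpy as np

def lemma22_minimizer(alpha, lam, L, p):
    """Minimize h(t) = 0.5*(t - alpha)^2 + (lam/L)*t^p over t >= 0
    using the case analysis of Lemma 2.2 (alpha > 0, 0 < p < 1 assumed)."""
    tbar = (2.0 * lam * (1.0 - p) / L) ** (1.0 / (2.0 - p))   # break point bar{t}
    h1 = lambda t: t + (lam * p / L) * t ** (p - 1.0)         # h_1(t) = t + (lam*p/L) t^{p-1}
    omega = h1(tbar)                                          # threshold bar{omega} = h_1(bar{t})
    if alpha < omega:
        return 0.0                                            # case 1: T*_alpha = {0}
    # cases 2 and 3: the nonzero minimizer solves h_1(t) = alpha on [bar{t}, alpha],
    # where h_1 is increasing; locate it by bisection.
    lo, hi = tbar, max(alpha, tbar)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h1(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def brute_force(alpha, lam, L, p, grid=1_000_000):
    """Grid search over (0, alpha + 1]; compare against h(0) = 0.5*alpha^2."""
    t = np.linspace(1e-12, alpha + 1.0, grid)
    h = 0.5 * (t - alpha) ** 2 + (lam / L) * t ** p
    h0 = 0.5 * alpha ** 2
    k = np.argmin(h)
    return 0.0 if h0 <= h[k] else t[k]

if __name__ == "__main__":
    lam, L, p = 0.5, 2.0, 0.5
    for alpha in (0.2, 0.8, 1.5):
        t_lemma = lemma22_minimizer(alpha, lam, L, p)
        t_grid = brute_force(alpha, lam, L, p)
        print(f"alpha={alpha:4.2f}: lemma -> {t_lemma:.6f}, grid -> {t_grid:.6f}")
```

By Lemma 2.2, both computations should return 0 for \(\alpha <{\bar{\omega }}\) and the same nonzero value (up to grid resolution) for \(\alpha >{\bar{\omega }}\).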


Cite this article

Cheng, W., Wang, X. & Chen, X. An Interior Stochastic Gradient Method for a Class of Non-Lipschitz Optimization Problems. J Sci Comput 92, 42 (2022). https://doi.org/10.1007/s10915-022-01897-6
