
The modified second APG method for DC optimization problems


Abstract

In this paper, we construct a variant of the second accelerated proximal gradient (APG) method introduced by Nesterov (Introductory Lectures on Convex Optimization, Kluwer Academic Publishers, Dordrecht, 2004) and Auslender and Teboulle (SIAM J Optim 16:697–725, 2006) [and so named by Tseng (Math Program 125:263–295, 2010)] for minimizing DC functions (differences of two convex functions). Under suitable assumptions such as level boundedness, the Kurdyka–Łojasiewicz property, and local Lipschitz differentiability, we prove that the sequence generated by our algorithm converges locally linearly to a stationary point of the given DC function. Numerical results show that our method performs well and is fast compared with several commonly used algorithms.
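To make the scheme concrete, the following is a minimal Python sketch of one possible second-APG-type iteration for a DC objective \(F=f+P_1-P_2\). It is not the authors' exact Algorithm 1: the schedule \(\theta_k=2/(k+2)\), the test model (the \(l_{1-2}\) regularized least squares problem of [34]), and all function names are illustrative assumptions; only the update pattern (extrapolated point \(y^k\), proximal \(z\)-step, convex combination \(x^{k+1}=(1-\theta_k)x^k+\theta_k z^{k+1}\)) mirrors the one analyzed in the appendix.

    import numpy as np

    def soft_threshold(v, t):
        # Proximal map of t*||.||_1 (componentwise soft thresholding).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def second_apg_dc(A, b, lam, n_iter=500):
        # Sketch: minimize 0.5*||Ax - b||^2 + lam*||x||_1 - lam*||x||_2,
        # i.e. f = least squares, P1 = lam*l1, P2 = lam*l2 (a DC function).
        n = A.shape[1]
        L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of grad f
        x = np.zeros(n)
        z = np.zeros(n)
        for k in range(n_iter):
            theta = 2.0 / (k + 2)                # assumed APG schedule
            y = (1 - theta) * x + theta * z      # extrapolation step
            nx = np.linalg.norm(x)
            xi = lam * x / nx if nx > 0 else np.zeros(n)  # xi^k in subdiff of P2 at x^k
            g = A.T @ (A @ y - b) - xi           # grad f(y^k) - xi^k
            # z-step: the proximal subproblem behind inequality (24) in the appendix
            z = soft_threshold(z - g / (theta * L), lam / (theta * L))
            x = (1 - theta) * x + theta * z      # x^{k+1} = (1-theta)x^k + theta z^{k+1}
        return x

    # Toy usage: recover a sparse vector from random Gaussian measurements
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    x_true = np.zeros(100)
    x_true[[3, 27, 64]] = [1.0, -2.0, 1.5]
    x_hat = second_apg_dc(A, A @ x_true, lam=0.1)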


References

  1. Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer Academic Publishers, Dordrecht (2004)

  2. Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)

  3. Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. 125, 263–295 (2010)

  4. Le Thi, H.A., Le Hoai, M., Nguyen, V.V., Pham Dinh, T.: A DC programming approach for feature selection in support vector machines learning. Adv. Data Anal. Classif. 2(3), 259–278 (2008)

  5. Le Thi, H.A., Pham Dinh, T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)

  6. Alvarado, A., Scutari, G., Pang, J.S.: A new decomposition method for multiuser DC programming and its applications. IEEE Trans. Signal Process. 62, 2984–2998 (2014)

  7. Zhang, S., Xin, J.: Minimization of transformed \(L_{1}\) penalty: theory, difference of convex function algorithm, and robust application in compressed sensing. arXiv preprint arXiv:1411.5735v3

  8. Sanjabi, M., Razaviyayn, M., Luo, Z.-Q.: Optimal joint base station assignment and beamforming for heterogeneous networks. IEEE Trans. Signal Process. 62, 1950–1961 (2014)

  9. Hiriart-Urruty, J.B.: From convex optimization to nonconvex optimization: necessary and sufficient conditions for global optimization. In: Clarke, F.H., Dem'yanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics, vol. 43, pp. 219–240. Plenum Press, New York (1989)

  10. Hiriart-Urruty, J.B.: Generalized differentiability, duality and optimization for problems dealing with difference of convex functions. In: Ponstein, J. (ed.) Convexity and Duality in Optimization. Lecture Notes in Economics and Mathematical Systems, vol. 256, pp. 37–70. Springer, Berlin (1986)

  11. Hiriart-Urruty, J.B., Tuy, H.: Essays on nonconvex optimization. Math. Program. 41, 229–248 (1988)

  12. Auchmuty, G.: Duality algorithm for nonconvex variational principle. Research Report UH/MD-41, University of Houston (1988)

  13. Pham Dinh, T., Souad, E.B.: Algorithms for solving a class of nonconvex optimization problems: methods of subgradients. In: Fermat Days 85: Mathematics for Optimization. North-Holland, Amsterdam (1986)

  14. Gu, J., Xiao, X., Zhang, L.: A subgradient-based convex approximations method for DC programming and its applications. J. Ind. Manag. Optim. 12(4), 1349–1366 (2016)

  15. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithms and applications. Acta Math. Vietnam. 22, 289–355 (1997)

  16. Le Thi, H.A., Quynh, T.D., Adjallah, K.H.: A difference of convex functions algorithm for optimal scheduling and real-time assignment of preventive maintenance jobs on parallel processors. J. Ind. Manag. Optim. 10(1), 243–258 (2014)

  17. Wu, C., Li, C., Long, Q.: A DC programming approach for sensor network localization with uncertainties in anchor positions. J. Ind. Manag. Optim. 10(3), 817–826 (2014)

  18. Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. Ser. B. https://doi.org/10.1007/s10107-017-1181-0

  19. Pham Dinh, T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8, 476–505 (1998)

  20. Artacho, F.J.A., Fleming, R.M.T., Vuong, P.T.: Accelerating the DC algorithm for smooth functions. Math. Program. 169, 95–118 (2018)

  21. Liu, T., Pong, T.K., Takeda, A.: A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems. Preprint (2017). http://arxiv.org/abs/1710.05778

  22. Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69, 297–324 (2018)

  23. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)

  24. Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE Discussion Paper (2007)

  25. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, New York (1998)

  26. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)

  27. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)

  28. Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. https://doi.org/10.1007/s10208-017-9366-8

  29. Yang, W.H.: Error bounds for convex polynomials. SIAM J. Optim. 19, 1633–1647 (2009)

  30. Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165, 471–507 (2017)

  31. Liu, H., Wu, W., So, A.M.-C.: Quadratic optimization with orthogonality constraints: explicit Łojasiewicz exponent and linear convergence of line-search methods. In: Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), pp. 1158–1167 (2016)

  32. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. Ser. A 146, 459–494 (2014)

  33. Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. In: Fixed-Point Algorithms for Inverse Problems in Science and Engineering, vol. 49, pp. 185–212. Springer, New York (2011)

  34. Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(l_{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37, A536–A563 (2015)

  35. Candes, E.J., Wakin, M., Boyd, S.: Enhancing sparsity by reweighted \(l_{1}\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)

  36. Liu, T., Pong, T.K.: Further properties of the forward–backward envelope with applications to difference-of-convex programming. Comput. Optim. Appl. 67(3), 489–520 (2017)

  37. Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3, 538–543 (1993)


Acknowledgements

The authors would like to thank Prof. T. K. Pong for his stimulating discussions and useful suggestions on the paper, and the referees for their constructive comments, which led to many improvements, and for pointing out the references [2, 14, 16, 17, 19, 20, 21, 30, 36]. This work was supported by the National Natural Science Foundation of China (Grant Nos. 11371173, 11301222) and the Fundamental Research Funds for the Central Universities (Grant Nos. 21615453, 21617417).

Author information

Correspondence to Daoling Lin.

Appendix: Proof of Lemma 2

Proof

(Lemma 2) By the definition of \(z^{k+1}\) in Algorithm 1, together with the 3-Point Property (see [37, Lemma 2.2] and [3, Section 5]), we have

$$\begin{aligned}&\langle \nabla f(y^{k})-\xi ^k,z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\displaystyle \frac{\theta _k L}{2}\Vert z^{k+1}-z^{k}\Vert ^2\nonumber \\&\quad \le \,\langle \nabla f(y^{k})-\xi ^k,x^{k}-y^{k} \rangle -\displaystyle \frac{\theta _k L}{2}\Vert z^{k+1}-x^{k}\Vert ^2\nonumber \\&\quad \quad +\,\displaystyle \frac{\theta _k L}{2}\Vert x^{k}-z^{k}\Vert ^2+P_1(x^{k}). \end{aligned}$$
(24)
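For the reader's convenience, the 3-Point Property used to obtain (24) can be stated as follows (a sketch, assuming \(z^{k+1}\) solves the proximal subproblem of Algorithm 1 with \(g^k:=\nabla f(y^{k})-\xi ^k\)): for every \(x\),

$$\begin{aligned} \langle g^k,z^{k+1} \rangle +P_1(z^{k+1})+\frac{\theta _k L}{2}\Vert z^{k+1}-z^{k}\Vert ^2 \le \langle g^k,x \rangle +P_1(x)+\frac{\theta _k L}{2}\Vert x-z^{k}\Vert ^2-\frac{\theta _k L}{2}\Vert x-z^{k+1}\Vert ^2. \end{aligned}$$

Taking \(x=x^{k}\) and subtracting \(\langle g^k,y^{k} \rangle \) from both sides yields (24).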

On the other hand, we have

$$\begin{aligned} F(x^{k+1})= & {} f(x^{k+1})+P(x^{k+1})\le f(y^{k})+\langle \nabla f(y^{k}),x^{k+1}-y^{k} \rangle \nonumber \\&+\,\frac{L}{2}\Vert x^{k+1}-y^{k}\Vert ^2+P(x^{k+1})\nonumber \\= & {} f(y^{k})+\langle \nabla f(y^{k}),(1-\theta _{k})x^{k}+\theta _{k}z^{k+1}-y^k \rangle +P_1(x^{k+1})-P_2(x^{k+1})\nonumber \\&+\,\frac{L}{2}\Vert x^{k+1}-y^{k}\Vert ^2\nonumber \\\le & {} \, (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle \right] \nonumber \\&+\,\theta _{k}\left[ f(y^{k}) +\langle \nabla f(y^{k}),z^{k+1}-y^{k}\rangle \right] \nonumber \\&+\,(1-\theta _{k})P_1(x^{k})+\theta _{k}P_1(z^{k+1})\nonumber \\&-\,P_2(x^{k}) -\langle \xi ^k,x^{k+1}-x^{k} \rangle +\displaystyle \frac{L}{2}{{\theta ^2_{k}}}\Vert z^{k+1}-z^{k}\Vert ^2\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] \nonumber \\&-\,P_2(x^{k}) -\langle \xi ^k,x^{k+1}-x^{k} \rangle \nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] \nonumber \\&-\,P_2(x^{k}) -\theta _{k}\langle \xi ^k,z^{k+1}-x^{k} \rangle \nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle -\langle \xi ^k,z^{k+1}-x^{k} \rangle \nonumber \\&+\,P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k}) +\theta _{k}[f(y^{k})\nonumber \\&+\,\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle -\langle \xi ^k,z^{k+1}-y^{k}+y^{k}-x^{k} \rangle \nonumber \\&+\,P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2] \end{aligned}$$
(25)

where the first inequality holds because \(\nabla f\) is Lipschitz continuous with modulus \(L>0\), and the second inequality follows from the convexity of \(P_1\) and \(P_2\), the fact that \(\xi ^k\in \partial P_2(x^k)\), and the relations \(y^{k}=(1-\theta _{k})x^{k}+\theta _{k}z^{k}\) (the extrapolation step of Algorithm 1) and \(x^{k+1}=(1-\theta _{k})x^{k}+\theta _{k}z^{k+1}\), which together give \(x^{k+1}-y^k=\theta _{k}(z^{k+1}-z^k)\). Now, we obtain further from (25) that

$$\begin{aligned} F(x^{k+1})&\le (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k})-\xi ^k,z^{k+1}-y^{k} \rangle +P_1(z^{k+1})\\&+\,\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]-\theta _{k}\langle \xi ^k,y^{k}-x^{k} \rangle . \end{aligned}$$

This, together with (24), implies

$$\begin{aligned} \begin{aligned} F(x^{k+1})&\le (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\\&\quad +\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k})-\xi ^k,x^{k}-y^{k} \rangle +P_1(x^{k})\\&\quad +\,\frac{L}{2}\theta _{k}\Vert x^{k}-z^{k}\Vert ^2-\frac{L}{2}\theta _{k}\Vert z^{k+1}-x^{k}\Vert ^2] -\theta _{k}\langle \xi ^k,y^{k}-x^{k} \rangle \\&=f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k}\rangle +P_1(x^{k})-P_2(x^{k})\\&\quad +\,\frac{L}{2}{\theta ^2_{k}}\Vert x^{k}-z^{k}\Vert ^2 -\frac{L}{2}{\theta ^2_{k}}\Vert x^{k}-z^{k+1}\Vert ^2. \end{aligned} \end{aligned}$$
(26)

Using the convexity of f,

$$\begin{aligned} f(y^k)+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle \le f(x^k). \end{aligned}$$
(27)

On the other hand, we have

$$\begin{aligned} x^k-z^k=(1-\theta _{k-1})x^{k-1}+\theta _{k-1}z^k-z^k=(1-\theta _{k-1}) (x^{k-1}-z^k). \end{aligned}$$

This relation, together with (26) and (27), implies

$$\begin{aligned} \begin{aligned}&[F(x^{k+1})+\frac{\alpha _{k+1}}{2}\Vert x^{k}-z^{k+1}\Vert ^2] - [F(x^{k})+\frac{\alpha _{k}}{2}\Vert x^{k-1}-z^{k}\Vert ^2]\\&\le \, \frac{1}{2}[{\theta ^2_{k}}(1-{\theta _{k-1}})^2L-\alpha _{k}]\Vert x^{k-1}-z^{k}\Vert ^2 +\frac{1}{2}[\alpha _{k+1}-{\theta ^2_{k}}L]\Vert x^{k}-z^{k+1}\Vert ^2. \end{aligned} \end{aligned}$$
(28)
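In detail, substituting (27) into (26) bounds the first three terms of (26) by \(F(x^{k})\), and the identity above gives \(\Vert x^{k}-z^{k}\Vert ^2=(1-\theta _{k-1})^2\Vert x^{k-1}-z^{k}\Vert ^2\), so that

$$\begin{aligned} F(x^{k+1})\le F(x^{k})+\frac{L{\theta ^2_{k}}}{2}(1-\theta _{k-1})^2\Vert x^{k-1}-z^{k}\Vert ^2-\frac{L{\theta ^2_{k}}}{2}\Vert x^{k}-z^{k+1}\Vert ^2; \end{aligned}$$

adding \(\frac{\alpha _{k+1}}{2}\Vert x^{k}-z^{k+1}\Vert ^2-\frac{\alpha _{k}}{2}\Vert x^{k-1}-z^{k}\Vert ^2\) to both sides and rearranging gives (28).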

Since \({\theta ^2_{k}}(1-{\theta _{k-1}})^2L -\alpha _{k}\le 0\) and \(\alpha _{k+1}-{\theta ^2_{k}}L\le -\delta \), we obtain (6) from (5). \(\square \)
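In particular (a sketch of the last step; the exact statements of (5) and (6) are not reproduced here), the two sign conditions make the right-hand side of (28) at most \(-\frac{\delta }{2}\Vert x^{k}-z^{k+1}\Vert ^2\), so the merit sequence decreases:

$$\begin{aligned} F(x^{k+1})+\frac{\alpha _{k+1}}{2}\Vert x^{k}-z^{k+1}\Vert ^2\le F(x^{k})+\frac{\alpha _{k}}{2}\Vert x^{k-1}-z^{k}\Vert ^2-\frac{\delta }{2}\Vert x^{k}-z^{k+1}\Vert ^2. \end{aligned}$$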


Cite this article

Lin, D., Liu, C. The modified second APG method for DC optimization problems. Optim Lett 13, 805–824 (2019). https://doi.org/10.1007/s11590-018-1280-8

