The modified second APG method for DC optimization problems

Lin, Daoling; Liu, Chunguang

doi:10.1007/s11590-018-1280-8

The modified second APG method for DC optimization problems

Original Paper
Published: 11 June 2018

Volume 13, pages 805–824, (2019)
Cite this article

Optimization Letters Aims and scope Submit manuscript

409 Accesses
Explore all metrics

Abstract

In this paper, we construct a variant of the second accelerated proximal gradient method introduced by Nesterov (Introductory lectures on convex optimization, Kluwer Academic Publisher, Dordrecht, 2004) and Auslender and Teboulle (SIAM J Optim 16:697–725, 2006) [and named by Tseng (Math Program 125:263–295, 2010)] for solving the minimization of DC functions (difference of two convex functions). Under some suitable assumptions such as level boundedness, Kurdyka–Łojasiewicz property, and locally Lipschitz differentiability, we prove that the sequence generated by our algorithm locally linearly converges to some stationary point of the given DC function. Numerical results show that our method performs well and fast, comparing to some other often used algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Article Open access 13 June 2022

Sum-of-Squares Relaxations for Information Theory and Variational Inference

Article 05 April 2024

The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving Variational Problems

Article 14 February 2018

References

Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer Academic Publisher, Dordrecht (2004)
Book MATH Google Scholar
Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)
Article MathSciNet MATH Google Scholar
Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. 125, 263–295 (2010)
Article MathSciNet MATH Google Scholar
Le Thi, H.A., Le Hoai, M., Nguyen, V.V., Pham Dinh, T.: A dc programming approach for feature selection in support vector machines learning. Adv. Data Anal. Classif. 2(3), 259–278 (2008)
Article MathSciNet MATH Google Scholar
Le Thi, H.A., Pham Dinh, T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)
Article MathSciNet MATH Google Scholar
Alvarado, A., Scutari, G., Pang, J.S.: A new decomposition method for multiuser DC programming and its applications. IEEE Trans. Signal Process. 62, 2984–2998 (2014)
Article MathSciNet MATH Google Scholar
Zhang, S., Xin, J.: Minimization of transformed $L_{1}$ penalty: theory, difference of convex function algorithm, and robust application in compressed sensing. arXiv preprint arXiv:1441.5735v3
Sanjabi, M., Razaviyayn, M., Luo, Z.-Q.: Optimal joint base station assignment and beamforming for heterogeneous networks. IEEE Trans. Signal Process. 62, 1950–1961 (2014)
Article MathSciNet MATH Google Scholar
Hiriart-Urruty, J.B.: From convex optimization to nonconvex optimization necessary and sufficient conditions for global optimization. In: Clarke, F.H., Dem’yanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics, vol. 43, pp. 219–240. Plenum Press, New York (1989)
Chapter Google Scholar
Hiriart-Urruty, J.B.: Generalized differentiability, duality and optimization for problems dealing with difference of convex functions. In: Ponstein, J. (ed.) Convexity and Duality in Optimization. Lecture Notes in Economics And Mathematical Systems, vol. 256, pp. 37–70. Springer, Berlin (1986)
Chapter Google Scholar
Hiriart-Urruty, J.B., Tuy, H.: Essays on nonconvex optimization. Math. Program. 41, 229–248 (1988)
Article Google Scholar
Auchmuty, G.: Duality algorithm for nonconvex variational principle. Research Report UH/MD-41, University of Houston (1988)
Pham Dinh, T., Souad, E.B.: Algorithms for solving a class of nonconvex optimizations problems: methods of subgradient. Fermat Day 85: Mathematics for Optimization, North Holland (1986)
Gu, J., Xiao, X., Zhang, L.: A subgradient-based convex approximations method for DC programming and its applications. J. Ind. Manag. Optim. 12(4), 1349–1366 (2016)
Article MathSciNet MATH Google Scholar
Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithm and applications. Acta Math. Vietnam. 22, 289–355 (1997)
MathSciNet MATH Google Scholar
Le Thi, H.A., Quynh, T.D., Adjallah, K.H.: A difference of convex functions algorithm for optimal scheduling and real-time assignment of preventive maintenance jobs on parallel processors. J. Ind. Manag. Optim. 10(1), 243–258 (2014)
MathSciNet MATH Google Scholar
Wu, C., Li, C., Long, Q.: A DC programming approach for senor network localization with uncertainties in anchor positions. J. Ind. Manag. Optim. 10(3), 817–826 (2014)
Article MathSciNet MATH Google Scholar
Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for spare optimization problems. Math. Program. Ser. B. https://doi.org/10.1007/s10107-107-1181-0
Pham, D.T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8, 476–505 (1998)
Article MathSciNet MATH Google Scholar
Artacho, F.J.A., Fleming, R.M.T., Vuong, P.T.: Accelerating the DC algorithm for smooth functions. Math. Program. 169, 95–118 (2018)
Article MathSciNet MATH Google Scholar
Liu, T., Pong, T.K., Takeda, A.: A successive difference-of-convex approximation method for a class of nonconvex nonmooth optimization problems. Preprint, 2017. http://arxiv.org/abs/1710.05778
Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-conex algorithm with extrapolation. Comput. Optim. Appl. 69, 297–324 (2018)
Article MathSciNet MATH Google Scholar
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)
Article MathSciNet MATH Google Scholar
Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE Discussion Paper (2007)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, New York (1998)
Book MATH Google Scholar
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Lojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)
Article MathSciNet MATH Google Scholar
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)
Article MathSciNet MATH Google Scholar
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Lojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. https://doi.org/10.1007/s10208-017-9366-8
Yang, W.H.: Error bounds for convex polynomials. SIAM J. Optim. 19, 1633–1647 (2009)
Article MathSciNet MATH Google Scholar
Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Progam. 165, 271–507 (2017)
MathSciNet MATH Google Scholar
Liu, H., Wu, W., SO, A. M.-C.: Quadratic optimization with orthogonality constraints: explicit Lojasiewicz exponent and linear convergence of line-search methods. In: Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), pp. 1158–1167 (2016)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Progam. Ser. A 146, 459–494 (2014)
Article MathSciNet MATH Google Scholar
Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. Fixed Point Algorithms Inverse Probl. Sci. Eng. 49, 185–212 (2012)
Article MathSciNet MATH Google Scholar
Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of $l_{1-2}$ foe compressed sensing. SIAM J. Sci. Comput. 37, 536–563 (2015)
Article MathSciNet Google Scholar
Candes, E.J., Wakin, M., Boyd, S.: Enhancing spasity by reweighted $l_{1}$ minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)
Article MathSciNet MATH Google Scholar
Liu, T., Pong, T.K.: Further properties of the forward-backward envelope with applications to difference-of-convex programming. Comput. Optim. Appl. 67(3), 489–520 (2017)
Article MathSciNet MATH Google Scholar
Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3, 538–543 (1993)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

The authors would like to thank Prof. T. K. Pong for his stimulating discussion and useful suggestions on the paper, and to thank the referees for their constructive comments leading to many improvements and for pointing out the references [2, 14, 16, 17, 19, 20, 21, 30, 36]. This work was supported by National Natural Science Foundation of China (Grant Nos. 11371173, 11301222), and the Fundamental Research Funds for the Central Universities (Grant Nos. 21615453, 21617417).

Author information

Authors and Affiliations

Department of Mathematics, Jinan University, Guangzhou, 510632, Guangdong, China
Daoling Lin & Chunguang Liu

Authors

Daoling Lin
View author publications
You can also search for this author in PubMed Google Scholar
Chunguang Liu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Daoling Lin.

Appendix: Proof of Lemma 2

Proof

(Lemma 2) By the definition of $z^{k+1}$ in Algorithm 1 and together with 3-Point Property (see [37, Lemma 2.2] and [3, Section 5]), we have

$$\begin{aligned}&\langle \nabla f(y^{k})-\xi ^k,z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\displaystyle \frac{\theta _k L}{2}\Vert z^{k+1}-z^{k}\Vert ^2\nonumber \\&\quad \le \,\langle \nabla f(y^{k})-\xi ^k,x^{k}-y^{k} \rangle -\displaystyle \frac{\theta _k L}{2}\Vert z^{k+1}-x^{k}\Vert ^2\nonumber \\&\quad \quad +\,\displaystyle \frac{\theta _k L}{2}\Vert x^{k}-z^{k}\Vert ^2+P_1(x^{k}). \end{aligned}$$

(24)

On the other hand, we have

$$\begin{aligned} F(x^{k+1})= & {} f(x^{k+1})+P(x^{k+1})\le f(y^{k})+\langle \nabla f(y^{k}),x^{k+1}-y^{k} \rangle \nonumber \\&+\,\frac{L}{2}\Vert x^{k+1}-y^{k}\Vert ^2+P(x^{k+1})\nonumber \\= & {} f(y^{k})+\langle \nabla f(y^{k}),(1-\theta _{k})x^{k}+\theta _{k}z^{k+1}-y^k \rangle +P_1(x^{k+1})-P_2(x^{k+1})\nonumber \\&+\,\frac{L}{2}\Vert x^{k+1}-y^{k}\Vert ^2\nonumber \\\le & {} \, (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle \right] \nonumber \\&+\,\theta _{k}\left[ f(y^{k}) +\langle \nabla f(y^{k}),z^{k+1}-y^{k}\rangle \right] \nonumber \\&+\,(1-\theta _{k})P_1(x^{k})+\theta _{k}P_1(z^{k+1})\nonumber \\&-\,P_2(x^{k}) -\langle \xi ^k,x^{k+1}-x^{k} \rangle +\displaystyle \frac{L}{2}{{\theta ^2_{k}}}\Vert z^{k+1}-z^{k}\Vert ^2\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] \nonumber \\&-\,P_2(x^{k}) -\langle \xi ^k,x^{k+1}-x^{k} \rangle \nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] \nonumber \\&-\,P_2(x^{k}) -\theta _{k}\langle \xi ^k,z^{k+1}-x^{k} \rangle \nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle +P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\nonumber \\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle -\langle \xi ^k,z^{k+1}-x^{k} \rangle \nonumber \\&+\,P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]\nonumber \\= & {} (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k}) +\theta _{k}[f(y^{k})\nonumber \\&+\,\langle \nabla f(y^{k}),z^{k+1}-y^{k} \rangle -\langle \xi ^k,z^{k+1}-y^{k}+y^{k}-x^{k} \rangle \nonumber \\&+\,P_1(z^{k+1})+\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2] \end{aligned}$$

(25)

where the first inequality is obtained since $\nabla f$ is Lipschitz continuous with the modulus $L> 0$, the second inequality follows from the convexity of $P_1$ and $P_2$ and the facts that $\xi ^k\in \partial P_2(x^k)$ and together with the relation $x^{k+1}-y^k=\theta _{k}(z^{k+1}-z^k)$, and $x^{k+1}=(1-\theta _{k})x^{k}+\theta _{k}z^{k+1}$. Now, we obtain further from (25) that

$$\begin{aligned} F(x^{k+1})&\le (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\\&+\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k})-\xi ^k,z^{k+1}-y^{k} \rangle +P_1(z^{k+1})\\&+\,\frac{L}{2}\theta _{k}\Vert z^{k+1}-z^{k}\Vert ^2]-\theta _{k}\langle \xi ^k,y^{k}-x^{k} \rangle . \end{aligned}$$

This together with (24) imply

$$\begin{aligned} \begin{aligned} F(x^{k+1})&\le (1-\theta _{k})\left[ f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle +P_1(x^{k})\right] -P_2(x^{k})\\&\quad +\,\theta _{k}[f(y^{k})+\langle \nabla f(y^{k})-\xi ^k,x^{k}-y^{k} \rangle +P_1(x^{k})\\&\quad +\,\frac{L}{2}\theta _{k}\Vert x^{k}-z^{k}\Vert ^2-\frac{L}{2}\theta _{k}\Vert z^{k+1}-x^{k}\Vert ^2] -\theta _{k}\langle \xi ^k,y^{k}-x^{k} \rangle \\&=f(y^{k})+\langle \nabla f(y^{k}),x^{k}-y^{k}\rangle +P_1(x^{k})-P_2(x^{k})\\&\quad +\,\frac{L}{2}{\theta ^2_{k}}\Vert x^{k}-z^{k}\Vert ^2 -\frac{L}{2}{\theta ^2_{k}}\Vert x^{k}-z^{k+1}\Vert ^2. \end{aligned} \end{aligned}$$

(26)

Using the convexity of f,

$$\begin{aligned} f(y^k)+\langle \nabla f(y^{k}),x^{k}-y^{k} \rangle \le f(x^k). \end{aligned}$$

(27)

On the other hand, we have

$$\begin{aligned} x^k-z^k=(1-\theta _{k-1})x^{k-1}+\theta _{k-1}z^k-z^k=(1-\theta _{k-1}) (x^{k-1}-z^k). \end{aligned}$$

This relation and (26), (27) together imply

$$\begin{aligned} \begin{aligned}&[F(x^{k+1})+\frac{\alpha _{k+1}}{2}\Vert x^{k}-z^{k+1}\Vert ^2] - [F(x^{k})+\frac{\alpha _{k}}{2}\Vert x^{k-1}-z^{k}\Vert ^2]\\&\le \, \frac{1}{2}[{\theta ^2_{k}}(1-{\theta _{k-1}})^2L-\alpha _{k}]\Vert x^{k-1}-z^{k}\Vert ^2 +\frac{1}{2}[\alpha _{k+1}-{\theta ^2_{k}}L]\Vert x^{k}-z^{k+1}\Vert ^2. \end{aligned} \end{aligned}$$

(28)

Since ${\theta ^2_{k}}(1-{\theta _{k-1}})^2L -\alpha _{k}\le 0,\alpha _{k+1}-{\theta ^2_{k}}L\le -\delta $, we obtain (6) by (5). $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Lin, D., Liu, C. The modified second APG method for DC optimization problems. Optim Lett 13, 805–824 (2019). https://doi.org/10.1007/s11590-018-1280-8

Download citation

Received: 29 October 2017
Accepted: 04 June 2018
Published: 11 June 2018
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s11590-018-1280-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

The modified second APG method for DC optimization problems

Abstract

Access this article

Similar content being viewed by others

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Sum-of-Squares Relaxations for Information Theory and Variational Inference

The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving Variational Problems

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendix: Proof of Lemma 2

Proof

Rights and permissions

About this article

Cite this article

Keywords

Navigation

The modified second APG method for DC optimization problems

Abstract

Access this article

Similar content being viewed by others

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Sum-of-Squares Relaxations for Information Theory and Variational Inference

The Deep Ritz Method: A Deep Learning-Based Numerical Algorithm for Solving Variational Problems

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendix: Proof of Lemma 2

Appendix: Proof of Lemma 2

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation