Abstract
In this paper, we construct a variant of the second accelerated proximal gradient (APG) method, introduced by Nesterov (Introductory Lectures on Convex Optimization, Kluwer Academic Publishers, Dordrecht, 2004) and by Auslender and Teboulle (SIAM J Optim 16:697–725, 2006), and so named by Tseng (Math Program 125:263–295, 2010), for minimizing DC functions (differences of two convex functions). Under suitable assumptions, such as level boundedness, the Kurdyka–Łojasiewicz property, and local Lipschitz differentiability, we prove that the sequence generated by our algorithm converges locally linearly to a stationary point of the given DC function. Numerical results show that our method performs efficiently compared with several commonly used algorithms.
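To fix ideas, the following is a minimal Python sketch of a second-APG-style iteration applied to a DC objective \(F=f+P_1-P_2\), where \(f\) is \(L\)-smooth and \(P_1,P_2\) are convex. The prox step size \(1/(\theta_k L)\), the standard APG update for \(\theta_k\), and the stopping rule are illustrative assumptions; they need not coincide with the exact parameter choices (e.g., the \(\alpha_k\)) of Algorithm 1 in the paper.

```python
import numpy as np

def second_apg_dc(grad_f, prox_P1, subgrad_P2, x0, L, max_iter=500, tol=1e-8):
    """Sketch: minimize F(x) = f(x) + P1(x) - P2(x), with f L-smooth and
    P1, P2 convex.  prox_P1(v, t) = argmin_x { P1(x) + ||x - v||^2 / (2t) }."""
    x = np.asarray(x0, dtype=float).copy()
    z = x.copy()
    theta = 1.0
    for _ in range(max_iter):
        y = (1.0 - theta) * x + theta * z          # interpolation point y^k
        xi = subgrad_P2(x)                         # xi^k in the subdifferential of P2 at x^k
        t = 1.0 / (theta * L)                      # prox step size (an assumption)
        # prox-gradient step on the convex model f + P1 - <xi, .>
        z_new = prox_P1(z - t * (grad_f(y) - xi), t)
        x_new = (1.0 - theta) * x + theta * z_new  # x^{k+1} = (1 - theta_k) x^k + theta_k z^{k+1}
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x)):
            x = x_new
            break
        # standard APG update: theta_{k+1} solves (1 - s)/s^2 = 1/theta_k^2
        theta = (np.sqrt(theta**4 + 4.0 * theta**2) - theta**2) / 2.0
        x, z = x_new, z_new
    return x
```

For instance, with \(P_1=\lambda \Vert \cdot \Vert _1\), the callable prox_P1(v, t) is the soft-thresholding map np.sign(v) * np.maximum(np.abs(v) - lam * t, 0.0).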
References
Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer Academic Publishers, Dordrecht (2004)
Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)
Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. 125, 263–295 (2010)
Le Thi, H.A., Le Hoai, M., Nguyen, V.V., Pham Dinh, T.: A DC programming approach for feature selection in support vector machines learning. Adv. Data Anal. Classif. 2(3), 259–278 (2008)
Le Thi, H.A., Pham Dinh, T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)
Alvarado, A., Scutari, G., Pang, J.S.: A new decomposition method for multiuser DC programming and its applications. IEEE Trans. Signal Process. 62, 2984–2998 (2014)
Zhang, S., Xin, J.: Minimization of transformed \(L_{1}\) penalty: theory, difference of convex function algorithm, and robust application in compressed sensing. arXiv preprint arXiv:1411.5735v3
Sanjabi, M., Razaviyayn, M., Luo, Z.-Q.: Optimal joint base station assignment and beamforming for heterogeneous networks. IEEE Trans. Signal Process. 62, 1950–1961 (2014)
Hiriart-Urruty, J.B.: From convex optimization to nonconvex optimization necessary and sufficient conditions for global optimization. In: Clarke, F.H., Dem’yanov, V.F., Giannessi, F. (eds.) Nonsmooth Optimization and Related Topics, vol. 43, pp. 219–240. Plenum Press, New York (1989)
Hiriart-Urruty, J.B.: Generalized differentiability, duality and optimization for problems dealing with difference of convex functions. In: Ponstein, J. (ed.) Convexity and Duality in Optimization. Lecture Notes in Economics And Mathematical Systems, vol. 256, pp. 37–70. Springer, Berlin (1986)
Hiriart-Urruty, J.B., Tuy, H.: Essays on nonconvex optimization. Math. Program. 41, 229–248 (1988)
Auchmuty, G.: Duality algorithm for nonconvex variational principle. Research Report UH/MD-41, University of Houston (1988)
Pham Dinh, T., Souad, E.B.: Algorithms for solving a class of nonconvex optimization problems: methods of subgradient. Fermat Days 85: Mathematics for Optimization, North-Holland (1986)
Gu, J., Xiao, X., Zhang, L.: A subgradient-based convex approximations method for DC programming and its applications. J. Ind. Manag. Optim. 12(4), 1349–1366 (2016)
Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to DC programming: theory, algorithm and applications. Acta Math. Vietnam. 22, 289–355 (1997)
Le Thi, H.A., Quynh, T.D., Adjallah, K.H.: A difference of convex functions algorithm for optimal scheduling and real-time assignment of preventive maintenance jobs on parallel processors. J. Ind. Manag. Optim. 10(1), 243–258 (2014)
Wu, C., Li, C., Long, Q.: A DC programming approach for sensor network localization with uncertainties in anchor positions. J. Ind. Manag. Optim. 10(3), 817–826 (2014)
Gotoh, J., Takeda, A., Tono, K.: DC formulations and algorithms for sparse optimization problems. Math. Program. Ser. B. https://doi.org/10.1007/s10107-017-1181-0
Pham Dinh, T., Le Thi, H.A.: A D.C. optimization algorithm for solving the trust-region subproblem. SIAM J. Optim. 8, 476–505 (1998)
Artacho, F.J.A., Fleming, R.M.T., Vuong, P.T.: Accelerating the DC algorithm for smooth functions. Math. Program. 169, 95–118 (2018)
Liu, T., Pong, T.K., Takeda, A.: A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems. Preprint (2017). http://arxiv.org/abs/1710.05778
Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69, 297–324 (2018)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)
Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE Discussion Paper (2007)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, New York (1998)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. https://doi.org/10.1007/s10208-017-9366-8
Yang, W.H.: Error bounds for convex polynomials. SIAM J. Optim. 19, 1633–1647 (2009)
Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165, 471–507 (2017)
Liu, H., Wu, W., So, A.M.-C.: Quadratic optimization with orthogonality constraints: explicit Łojasiewicz exponent and linear convergence of line-search methods. In: Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), pp. 1158–1167 (2016)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. Ser. A 146, 459–494 (2014)
Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. Fixed Point Algorithms Inverse Probl. Sci. Eng. 49, 185–212 (2012)
Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(l_{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37, 536–563 (2015)
Candes, E.J., Wakin, M., Boyd, S.: Enhancing sparsity by reweighted \(l_{1}\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)
Liu, T., Pong, T.K.: Further properties of the forward-backward envelope with applications to difference-of-convex programming. Comput. Optim. Appl. 67(3), 489–520 (2017)
Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3, 538–543 (1993)
Acknowledgements
The authors would like to thank Prof. T. K. Pong for his stimulating discussions and useful suggestions on the paper, and the referees for their constructive comments, which led to many improvements, and for pointing out the references [2, 14, 16, 17, 19, 20, 21, 30, 36]. This work was supported by the National Natural Science Foundation of China (Grant Nos. 11371173, 11301222) and the Fundamental Research Funds for the Central Universities (Grant Nos. 21615453, 21617417).
Appendix: Proof of Lemma 2
Proof
(Lemma 2) By the definition of \(z^{k+1}\) in Algorithm 1, together with the 3-Point Property (see [37, Lemma 2.2] and [3, Section 5]), we have
On the other hand, we have
where the first inequality holds since \(\nabla f\) is Lipschitz continuous with modulus \(L> 0\), and the second inequality follows from the convexity of \(P_1\) and \(P_2\), the fact that \(\xi ^k\in \partial P_2(x^k)\), and the relations \(x^{k+1}-y^k=\theta _{k}(z^{k+1}-z^k)\) and \(x^{k+1}=(1-\theta _{k})x^{k}+\theta _{k}z^{k+1}\). We then obtain further from (25) that
This, together with (24), implies
Using the convexity of \(f\),
On the other hand, we have
This relation, together with (26) and (27), implies
Since \({\theta ^2_{k}}(1-{\theta _{k-1}})^2L -\alpha _{k}\le 0\) and \(\alpha _{k+1}-{\theta ^2_{k}}L\le -\delta \), we obtain (6) from (5). \(\square \)
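For the reader's convenience, we record the Euclidean form of the 3-Point Property invoked at the beginning of the proof; this is a standard fact (the cited sources state it more generally, e.g., with Bregman distances; cf. [41]). If \(\psi \) is convex, \(t>0\), and \(z^{+}=\arg \min _{z}\{\psi (z)+\frac{1}{2t}\Vert z-z^{0}\Vert ^{2}\}\), then for every \(z\),
\[
\psi (z^{+})+\frac{1}{2t}\Vert z^{+}-z^{0}\Vert ^{2}\;\le \;\psi (z)+\frac{1}{2t}\Vert z-z^{0}\Vert ^{2}-\frac{1}{2t}\Vert z-z^{+}\Vert ^{2},
\]
which follows directly from the \(1/t\)-strong convexity of the prox subproblem.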
Cite this article
Lin, D., Liu, C. The modified second APG method for DC optimization problems. Optim Lett 13, 805–824 (2019). https://doi.org/10.1007/s11590-018-1280-8