Abstract
DC (difference of convex functions) programming and the DC algorithm (DCA) are powerful tools for nonsmooth nonconvex optimization. The field was created in a preliminary form by Pham Dinh Tao in 1985; intensive research by the authors of this paper has driven decisive developments since 1993, and these tools have now become classic and increasingly popular worldwide. In the thirty-five years since their birth, they have been greatly enriched, thanks to numerous applications by researchers and practitioners around the world, in modeling and solving nonconvex programs from many fields of applied science. This paper is devoted to key open issues, recent advances, and trends in the development of these tools to meet the growing need for nonconvex programming and global optimization. We first outline the foundations of DC programming and DCA, which allows us to highlight the philosophy of these tools, discuss key issues, formulate open problems, and bring relevant answers. After outlining the key open issues that require deeper and more appropriate investigation, we present recent advances and ongoing work on these issues. They turn around novel solution techniques for improving DCA's efficiency and scalability, and a new generation of algorithms beyond the standard framework of DC programming and DCA, designed for large-dimensional DC programs and DC learning with Big Data, as well as for broader classes of nonconvex problems beyond DC programs.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
This is the full paper of the author's plenary lecture, as the winner of the Constantin Caratheodory Prize 2021, at the World Congress on Global Optimization (WCGO 2021).
Appendices
Appendix A Convergence of Standard DCA
Let \(X \subset \mathbb {R}^n\) and \(Y \subset \mathbb {R}^n\) be two nonempty convex sets, and let \(\rho _{i}\) and \(\rho _{i}^{*}\) \((i=1,2)\) be nonnegative real numbers such that \(0 \le \rho _{i}<\rho (f_{i},X)\) (resp. \(0 \le \rho _{i}^{*}<\rho (f_{i}^{*},Y)\)), where \(\rho _{i}=0\) (resp. \(\rho _{i}^{*}=0\)) if \(\rho (f_{i},X)=0\) (resp. \(\rho (f_{i}^{*},Y)=0\)), and \(\rho _{i}\) (resp. \(\rho _{i}^{*}\)) may take the value \(\rho (f_{i},X)\) (resp. \(\rho (f_{i}^{*},Y)\)) if it is attained. We set \(f_{1}=g\) and \(f_{2}=h\), and let \(dx^{k}:=x^{k+1}-x^{k}\) and \(dy^{k}:=y^{k+1}-y^{k}\).
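The modulus \(\rho (f,C)\) used above is not restated in this appendix; for the reader's convenience, here is its standard definition from the DC programming literature, written as a display:

```latex
% Modulus of strong convexity of a convex function f on a convex set C:
\rho(f,C) \;=\; \sup\left\{ \rho \ge 0 \;:\;
    f - \frac{\rho}{2}\,\Vert \cdot \Vert^{2} \ \text{is convex on } C \right\},
% so f is strongly convex on C if and only if \rho(f,C) > 0.
```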
Theorem 11
Let \(X \subset \mathbb {R}^n\) and \(Y \subset \mathbb {R}^n\) be two nonempty convex sets containing the sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) generated by DCA, respectively, and let \(dx^{k}:=x^{k+1}-x^{k}\), \(dy^{k}:=y^{k+1}-y^{k}\). DCA is a descent method without line search but with global convergence, and it enjoys the following key properties:
1. For the primal DC program \((P_{dc})\)
The decrease of the sequence \(\{(g-h)(x^{k})\}\) is expressed by
\[ (g-h)(x^{k+1}) \le (h^{*}-g^{*})(y^{k})-\frac{\rho _{2}}{2}\Vert dx^{k}\Vert ^{2} \le (g-h)(x^{k})-\frac{\rho _{1}+\rho _{2}}{2}\Vert dx^{k}\Vert ^{2}, \quad \forall k, \]
where equality holds if and only if \(x^{k}\in \partial g^{*}(y^{k})\), \(y^{k}\in \partial h(x^{k+1})\) and \((\rho _{1}+\rho _{2})dx^{k}=0\).
In this case, one obtains the following main statements:
1.1 \(x^{k},x^{k+1}\) are DC critical points of \(g-h\) satisfying \(y^{k}\in (\partial g(x^{k})\cap \partial h(x^{k}))\) and \(y^{k}\in (\partial g(x^{k+1})\cap \partial h(x^{k+1})),\)
1.2 \(y^{k}\) is a DC critical point of \(h^{*}-g^{*}\) and \([x^{k},x^{k+1}]\subset (\partial g^{*}(y^{k})\cap \partial h^{*}(y^{k})),\)
1.3 If \(\rho _{1}+\rho _{2}>0\), then \(x^{k+1}=x^{k}\), \(y^{k}=y^{k-1}\) if \(\rho _{1}^{*}>0\) and \(y^{k+1}=y^{k}\) if \(\rho _{2}^{*}>0\).
Furthermore, if g or h is strictly convex on X, then \(x^{k+1}=x^{k}\).
In this case of equality in (A1), DCA terminates at the \(k\)th iteration (finite convergence of DCA).
2 For the dual DC program \((D_{dc})\)
Similarly, the DC duality provides for the dual DC program \((D_{dc})\) the decrease of the sequence \(\{(h^{*}-g^{*})(y^{k})\}\): \((h^{*}-g^{*})(y^{k+1}) \le (g-h)(x^{k+1})-\dfrac{\rho _{2}^{*}}{2}\Vert dy^{k}\Vert ^{2} \le (h^{*}-g^{*})(y^{k})-\dfrac{\rho _{1}^{*}+\rho _{2}^{*}}{2}\Vert dy^{k}\Vert ^{2}, \forall k.\) (A2)
The equality \((h^{*}-g^{*})(y^{k+1})=(h^{*}-g^{*})(y^{k})\) occurs if and only if \(x^{k+1}\in \partial g^{*}(y^{k+1})\), \(y^{k}\in \partial h(x^{k+1})\) and \((\rho _{1}^{*}+\rho _{2}^{*})dy^{k}=0\).
In this case, the following properties hold:
2.1 The equality \((h^{*}-g^{*})(y^{k+1})=(g-h)(x^{k+1})\) holds and \(y^{k},y^{k+1}\) are DC critical points of \(h^{*}-g^{*}\) with \(x^{k+1}\in (\partial g^{*}(y^{k})\cap \partial h^{*}(y^{k}))\) and \(x^{k+1}\in (\partial g^{*}(y^{k+1})\cap \partial h^{*}(y^{k+1})),\)
2.2 \(x^{k+1}\) is a DC critical point of \(g-h\) and \([y^{k},y^{k+1}]\subset (\partial g(x^{k+1})\cap \partial h(x^{k+1})),\)
2.3 \(y^{k+1}=y^{k}\) if \(\rho _{1}^{*}+\rho _{2}^{*}>0,x^{k+1}=x^{k}\) if \(\rho _{2}>0\) and \(x^{k+2}=x^{k+1}\) if \(\rho _{1}>0\).
Furthermore, if \(g^{*}\) or \(h^{*}\) is strictly convex on \(\mathbb {R}^n\), then \(y^{k+1}=y^{k}.\)
As in 1.3, in this case of equality in (A2), DCA terminates at the \(k\)th iteration (finite convergence of DCA).
3. If \(\rho _{1}+\rho _{2}>0\), then the primal DC series \(\sum _{k=0}^{\infty }\Vert x^{k+1}-x^{k}\Vert ^{2}\) converges, with \(\sum _{k=0}^{\infty }\Vert x^{k+1}-x^{k}\Vert ^{2}\le \frac{2}{\rho _{1}+\rho _{2}}[(g-h)(x^{0})-\alpha ].\) (A3)
Dually, if \(\rho _{1}^{*}+\rho _{2}^{*}>0\), then the dual DC series \(\sum _{k=0}^{\infty }\Vert y^{k+1}-y^{k}\Vert ^{2}\) converges, with \(\sum _{k=0}^{\infty }\Vert y^{k+1}-y^{k}\Vert ^{2}\le \frac{2}{\rho _{1}^{*}+\rho _{2}^{*}}[(h^{*}-g^{*})(y^{0})-\alpha ].\) (A4)
4. If \(\alpha \) (the common optimal value of \((P_{dc})\) and \((D_{dc})\)) is finite, the sequences \(\{(g-h)(x^{k})\}\), \(\{(h^{*}-g^{*})(y^{k})\}\) decrease and converge to the same limit \(\beta \ge \alpha \): \(\lim _{k\rightarrow +\infty }(g-h)(x^{k})=\lim _{k\rightarrow +\infty }(h^{*}-g^{*})(y^{k})=\beta .\)
5. If \(\alpha \) is finite and the sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) are bounded, then for every limit point \(x^{*}\) of \(\{x^{k}\}\) (resp. \(y^{*}\) of \(\{y^{k}\}\)) there exists a limit point \(y^{*}\) of \(\{y^{k}\}\) (resp. \(x^{*}\) of \(\{x^{k}\}\)) such that \((x^{*},y^{*})\in [\partial g^{*}(y^{*})\cap \partial h^{*}(y^{*})]\times [\partial g(x^{*})\cap \partial h(x^{*})]\) and \((g-h)(x^{*})=(h^{*}-g^{*})(y^{*})=\beta \ge \alpha .\) Such a point \(x^{*}\) (resp. \(y^{*}\)) is a DC critical point of \(g-h\) (resp. \(h^{*}-g^{*}\)).
6. DCA’s complexity for primal and dual DC programs
Let \(x^{*}\) be a DC critical point of \(g-h\) defined as a limit point of the sequence \(\{x^{k}\}\) computed by the primal DCA. Then, from (A3), one deduces \(f(x^{*})=(g-h)(x^{*})=\beta :=\lim _{k\rightarrow +\infty }f(x^{k})=\lim _{k\rightarrow +\infty }(g-h)(x^{k})\) and \(\frac{\rho _{1}+\rho _{2}}{2}(k+1)\min \{\Vert x^{l+1}-x^{l}\Vert ^{2}:l=0,\ldots ,k\}\le f(x^{0})-f(x^{*}).\) Moreover, if \(\rho _{1}+\rho _{2}>0\), then \(\min \{\Vert x^{l+1}-x^{l}\Vert :l=0,\ldots ,k\}\le \frac{2^{1/2}[f(x^{0})-f(x^{*})]^{1/2}}{(\rho _{1}+\rho _{2})^{1/2}(k+1)^{1/2}}.\)
Likewise, by using the same reasoning for the sequence \(\{y^{k}\}\) via (A4), we obtain similar results for the dual DCA: \((h^{*}-g^{*})(y^{*})=\beta :=\lim _{k\rightarrow +\infty }(h^{*}-g^{*})(y^{k})\) and \(\frac{\rho _{1}^{*}+\rho _{2}^{*}}{2}\sum \limits _{l=0}^{k}\Vert y^{l+1}-y^{l}\Vert ^{2} \le (h^{*}-g^{*})(y^{0})-(h^{*}-g^{*})(y^{k+1}) \le (h^{*}-g^{*})(y^{0})-\beta \le (h^{*}-g^{*})(y^{0})-\alpha ,~\forall k.\) Hence, if \(\rho _{1}^{*}+\rho _{2}^{*}>0\), we obtain
\(\min \{\Vert y^{l+1}-y^{l}\Vert :l=0,\ldots ,k\}\le \frac{2^{1/2}[(h^{*}-g^{*})(y^{0})-(h^{*}-g^{*})(y^{*})]^{1/2}}{(\rho _{1}^{*}+\rho _{2}^{*})^{1/2}(k+1)^{1/2}}.\)
Therefore, both the primal and the dual DCA have an \(O(1/\sqrt{k})\) complexity.
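To make the descent, criticality, and complexity statements above concrete, here is a minimal DCA run on an assumed toy DC decomposition (an illustration, not an example from the paper): \(g(x)=x^{4}\), \(h(x)=x^{2}\), so \(f=g-h\) is the double well \(x^{4}-x^{2}\), with \(\rho _{1}=\rho (g,\mathbb {R})=0\) and \(\rho _{2}=\rho (h,\mathbb {R})=2\). The convex subproblem \(\min _{x}\, g(x)-y^{k}x\) has the closed-form solution \(4x^{3}=y^{k}\).

```python
import math

# Toy DCA sketch (assumed decomposition, not from the paper):
# f(x) = g(x) - h(x) with g(x) = x**4, h(x) = x**2.
# DC critical points: x = 0 and x = +-1/sqrt(2) with f = -1/4.
f = lambda t: t**4 - t**2

x, xs = 2.0, [2.0]
for _ in range(60):
    y = 2.0 * x                              # dual step: y^k = h'(x^k)
    s = 1.0 if y >= 0 else -1.0              # primal step: solve 4x^3 = y^k
    x = s * (abs(y) / 4.0) ** (1.0 / 3.0)
    xs.append(x)

fs = [f(t) for t in xs]
# 1. Descent without linesearch: {f(x^k)} is nonincreasing.
assert all(fs[k + 1] <= fs[k] + 1e-12 for k in range(len(fs) - 1))
# 2. Convergence to the DC critical point x* = 1/sqrt(2).
assert abs(xs[-1] - 0.5 ** 0.5) < 1e-9
# 3. O(1/sqrt(k)) estimate: with rho_1 + rho_2 = 2 and beta = lim f(x^k),
#    min_{l<=k} |x^{l+1}-x^l| <= sqrt(2*(f(x^0)-beta) / ((rho_1+rho_2)*(k+1))).
beta, rho = fs[-1], 2.0
for k in range(60):
    min_step = min(abs(xs[l + 1] - xs[l]) for l in range(k + 1))
    bound = math.sqrt(2.0 * (fs[0] - beta) / (rho * (k + 1)))
    assert min_step <= bound + 1e-12
```

The same run makes statement 5 visible: at the limit point \(x^{*}=1/\sqrt{2}\) one has \(h'(x^{*})=2x^{*}=4x^{*3}=g'(x^{*})\), i.e. \(\partial g(x^{*})\cap \partial h(x^{*})\ne \emptyset \), so \(x^{*}\) is DC critical.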
Appendix B Global convergence of GDCA1
Denote by \(I(x):=\left\{ i\in \{1,\ldots ,m\}: ~ f_{i}(x)=p(x)\right\} \). We say that the extended Mangasarian-Fromowitz constraint qualification (EMFCQ) is satisfied at \(x^{*}\in E\) with \(I(x^{*})\not =\emptyset \) if there exists a direction \(d\), feasible for \(C\) at \(x^{*}\), such that \(f_{i}^{\uparrow }(x^{*},d)<0\) for all \(i\in I(x^{*})\).
When the functions \(f_{i}\) are continuously differentiable, \(f_{i}^{\uparrow }(x^{*},d)=\langle \nabla f_{i}(x^{*}),d\rangle ,\) and (EMFCQ) reduces to the well-known Mangasarian-Fromowitz constraint qualification. It is well known that if the (extended) Mangasarian-Fromowitz constraint qualification is satisfied at a local minimizer \(x^{*}\) of problem (6), then the KKT first-order necessary condition (7) holds (see [160, 161]). In the global convergence theorem, we make use of the following assumption:
Assumption 3
The (extended) Mangasarian-Fromowitz constraint qualification (EMFCQ) is satisfied at any \(x\in {{\mathbb {R}}}^{n}\) with \(p(x)\ge 0.\)
When \(f_{i}\), \(i=1,\ldots ,m,\) are all convex functions, this assumption is clearly satisfied under the Slater condition, i.e., when there exists a point \(\bar{x}\) such that \(f_{i}(\bar{x})<0\) for all \(i=1,\ldots ,m.\)
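A small numeric check of this remark, on an assumed convex instance (the functions below are illustrative, not from the paper): for convex differentiable \(f_{i}\) and a Slater point \(\bar{x}\), the direction \(d=\bar{x}-x\) satisfies \(\langle \nabla f_{i}(x),d\rangle \le f_{i}(\bar{x})-f_{i}(x)<0\) for every active index \(i\) at a point \(x\) with \(p(x)\ge 0\), so an MFCQ direction always exists there.

```python
# Assumed convex instance (not from the paper):
# f1(x) = x1^2 + x2^2 - 1, f2(x) = x1 + x2 - 1; Slater point xbar = (0, 0).
f = [lambda x: x[0]**2 + x[1]**2 - 1.0, lambda x: x[0] + x[1] - 1.0]
grad = [lambda x: (2.0 * x[0], 2.0 * x[1]), lambda x: (1.0, 1.0)]
xbar = (0.0, 0.0)                         # f_i(xbar) < 0 for all i

x = (1.0, 0.5)                            # a point with p(x) >= 0
p = max(fi(x) for fi in f)                # p(x) = max_i f_i(x)
active = [i for i in range(len(f)) if abs(f[i](x) - p) < 1e-12]
d = (xbar[0] - x[0], xbar[1] - x[1])      # candidate MFCQ direction d = xbar - x
# convexity gives <grad f_i(x), d> <= f_i(xbar) - f_i(x) < 0 on active indices
slopes = [sum(g * di for g, di in zip(grad[i](x), d)) for i in active]
assert p >= 0 and all(s < 0 for s in slopes)
```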
Theorem 12
Suppose that \(C\subseteq {{\mathbb {R}}}^{n}\) is a nonempty closed convex set and \(f_{i}\), \(i=1,\ldots ,m\) are DC functions on C. Suppose further that Assumptions 1–3 are verified. Let \(\delta >0,\) \(\beta _{1}>0\) be given. Let \(\{x^{k}\}\) be a sequence generated by GDCA1. Then GDCA1 either stops, after finitely many iterations, at a KKT point \(x^{k}\) for problem (6) or generates an infinite sequence \(\{x^{k}\}\) of iterates such that \(\lim _{k\rightarrow \infty }\Vert x^{k+1}-x^{k}\Vert =0\) and every limit point \(x^{\infty }\) of the sequence \(\{x^{k}\}\) is a KKT point of problem (6).
Appendix C Global convergence of GDCA2
Recall, as defined in the preceding section, that \(\varphi _{k}(x):=f_{0}(x)+\beta _{k}p^{+}(x).\) The following lemma is needed to investigate the convergence of GDCA2.
Lemma 13
The sequence \((x^{k},t^{k})\) generated by GDCA2 satisfies the following inequality: \(\varphi _{k}(x^{k})-\varphi _{k}(x^{k+1})\ge \frac{\rho }{2}\Vert x^{k+1}-x^{k}\Vert ^{2}\) for all \(k=1,2,\ldots \), where \(\rho :=\rho (g_{0},C)+\rho (h_{0},C)+\min \{\rho (g_{i},C): ~ i=1,\ldots ,m\}.\)
Theorem 14
Suppose that \(C\subseteq {{\mathbb {R}}}^{n}\) is a nonempty closed convex set and \(f_{i}, i=1,\ldots ,m,\) are DC functions on C such that Assumptions 1 and 3 are verified. Suppose further that for each \(i=0,\ldots ,m,\) either \(g_{i}\) or \(h_{i}\) is differentiable on C and that \(\rho :=\rho (g_{0},C)+\rho (h_{0},C)+\min \{\rho (g_{i},C): i=1,\ldots ,m\}>0.\) Let \(\delta _{1},\delta _{2}>0,\) \(\beta _{1}>0\) be given. Let \(\{x^{k}\}\) be a sequence generated by GDCA2. Then GDCA2 either stops, after finitely many iterations, at a KKT point \(x^{k}\) for problem (6) or generates an infinite sequence \(\{x^{k}\}\) of iterates such that \(\lim _{k\rightarrow \infty }\Vert x^{k+1}-x^{k}\Vert =0\) and every limit point \(x^{\infty }\) of the sequence \(\{x^{k}\}\) is a KKT point of problem (6).
Note that, as shown in Theorems 12 and 14, the penalty parameter \(\beta _{k}\) remains constant once k is sufficiently large. As observed in the proofs of these convergence theorems, the sequence \(\{\varphi _{k}(x^{k})\}\) of values of \(\varphi _{k}(x)=f_{0}(x)+\beta _{k}p^{+}(x)\) along the sequences \(\{x^{k}\}\) generated by GDCA1 and GDCA2 is decreasing. These results remain valid if, in (11), we replace the variable t by \(t_{i}\) for \(i=1,\ldots ,m\) and the function \(\beta _{k}t\) by \(\beta _{k}\sum _{i=1}^{m}t_{i}.\)
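The role of a sufficiently large, eventually constant penalty parameter can be seen on a one-dimensional toy instance (an assumed example, not the paper's GDCA iterations): minimize \(f_{0}(x)=-x\) subject to \(f_{1}(x)=x^{2}-1\le 0\), whose solution is \(x^{*}=1\). Beyond the exact-penalty threshold (here \(\beta \ge 1/2\)), the unconstrained minimizer of \(\varphi _{\beta }(x)=f_{0}(x)+\beta p^{+}(x)\) coincides with \(x^{*}\); below it, the penalized minimizer is infeasible.

```python
# Toy exact-penalty illustration (assumed instance, not the paper's GDCA):
# min f0(x) = -x  s.t.  f1(x) = x^2 - 1 <= 0,  solution x* = 1.
def phi(x, beta):
    # phi_beta(x) = f0(x) + beta * p^+(x) with p^+(x) = max(f1(x), 0)
    return -x + beta * max(x * x - 1.0, 0.0)

def argmin_on_grid(beta, lo=-3.0, hi=6.0, n=90001):
    # brute-force grid minimization, good enough for a 1-D illustration
    step = (hi - lo) / (n - 1)
    pts = [lo + i * step for i in range(n)]
    return min(pts, key=lambda t: phi(t, beta))

x_small = argmin_on_grid(0.1)   # beta below threshold: minimizer near x = 5, infeasible
x_large = argmin_on_grid(1.0)   # beta above threshold: minimizer near x = 1 = x*
```

This mirrors the remark above: once \(\beta _{k}\) exceeds the threshold and stays constant, minimizing \(\varphi _{k}\) recovers the constrained solution.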
Le Thi, H.A., Pham Dinh, T.: Open issues and recent advances in DC programming and DCA. J. Glob. Optim. 88, 533–590 (2024). https://doi.org/10.1007/s10898-023-01272-1