Open issues and recent advances in DC programming and DCA

Abstract

DC (difference of convex functions) programming and the DC algorithm (DCA) are powerful tools for nonsmooth nonconvex optimization. The field was created in its preliminary form by Pham Dinh Tao in 1985; intensive research by the authors of this paper since 1993 has led to decisive developments, and it has now become classic and increasingly popular worldwide. Over the 35 years since their birth, these theoretical and algorithmic tools have been greatly enriched, thanks to their many applications, by researchers and practitioners around the world, to model and solve nonconvex programs arising in numerous fields of applied science. This paper is devoted to key open issues, recent advances, and trends in the development of these tools to meet the growing need for nonconvex programming and global optimization. We first give an outline of the foundations of DC programming and DCA, which allows us to highlight the philosophy of these tools, discuss key issues, formulate open problems, and bring relevant answers. After outlining key open issues that require deeper and more appropriate investigation, we present recent advances and ongoing work on these issues. They revolve around novel solution techniques to improve DCA's efficiency and scalability, and a new generation of algorithms beyond the standard framework of DC programming and DCA, designed for large-dimensional DC programs and DC learning with Big Data as well as for broader classes of nonconvex problems beyond DC programs.


References

  1. Le Thi, H.A., Pham Dinh, T.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133(1), 23–46 (2005)


  2. Le Thi, H.A., Pham Dinh, T.: DC programming and DCA: thirty years of developments. Math. Program. Special Issue DC Program. Theory Algorithms Appl. 169(1), 5–68 (2018)


  3. Pham Dinh, T., Le Thi, H.A.: D.C. optimization algorithms for solving the trust region subproblem. SIAM J. Optim. 8(2), 476–505 (1998)


  4. Pham Dinh, T., Le Thi, H.A.: Recent advances in DC programming and DCA. In: Nguyen, N.-T., Le-Thi, H. (eds.) Transactions on Computational Intelligence XIII. Lecture Notes in Computer Science, vol. 8342, pp. 1–37. Springer, Berlin (2014)


  5. Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to D.C. programming: theory, algorithm and applications. Acta Math. Vietnam 22(1), 289–355 (1997)


  6. Hartman, P.: On functions representable as a difference of convex functions. Pac. J. Math. 9(3), 707–713 (1959)


  7. Pham Dinh, T., Souad, E.B.: Algorithms for solving a class of nonconvex optimization problems. Methods of subgradients. In: Hiriart-Urruty, J.-B. (ed.) Fermat Days 85: Mathematics for Optimization. North-Holland Mathematics Studies, vol. 129, pp. 249–271. North-Holland, Amsterdam (1986)


  8. Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches, 3rd edn. Springer, Heidelberg (1996)


  9. Horst, R., Pardalos, P.M., Thoai, N.V.: Introduction to Global Optimization. Springer, New York (1995)


  10. Horst, R., Thoai, N.V.: DC programming: overview. J. Optim. Theory Appl. 103(1), 1–43 (1999)


  11. Le Thi, H.A., Huynh, V.N., Pham Dinh, T.: Convergence analysis of DCA with subanalytic data. J. Optim. Theory Appl. 179, 103–126 (2018)


  12. Pang, J.-S., Razaviyayn, M., Alvarado, A.: Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42(1), 95–118 (2017)


  13. Le Thi, H.A., Pham Dinh, T., Huynh, V.N.: Exact penalty and error bounds in DC programming. J. Global Optim. 52(3), 509–535 (2012)


  14. Le Thi, H.A., Huynh, V.N., Pham Dinh, T.: Error bounds via exact penalization with applications to concave and quadratic systems. J. Optim. Theory Appl. 171(1), 228–250 (2016)


  15. Le Thi, H.A.: An efficient algorithm for globally minimizing a quadratic function under convex quadratic constraints. Math. Program. 87, 401–426 (2000)


  16. Le Thi, H.A., Phan, D.N., Pham Dinh, T.: Advanced Difference of Convex functions Algorithms for Nonconvex Programming (submitted) (2021)

  17. Le Thi, H.A., Phan, D.N., Pham Dinh, T.: Extended DCA based Algorithms for Nonconvex Programming (submitted) (2021)

  18. Polyak, B.: Introduction to Optimization. Optimization Software Inc, New York (1987)


  19. Chambolle, A., DeVore, R.A., Lee, N.Y., Lucier, B.J.: Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage. IEEE Trans. Image Process. 7, 319–335 (1998)


  20. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Elsevier, San Diego (1970)


  21. Bradley, P.S., Mangasarian, O.L.: Feature selection via concave minimization and support vector machines. In: Proceedings of the 15th International Conference on Machine Learning, pp. 82–90. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1998)

  22. Yuille, A.L., Rangarajan, A.: The concave-convex procedure. Neural Comput. 15(4), 915–936 (2003)


  23. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B Met. 39(1), 1–38 (1977)


  24. Sun, W., Sampaio, R.J.B., Candido, M.A.B.: Proximal point algorithm for minimization of DC function. J. Comput. Math. 21, 451–462 (2003)


  25. Razaviyayn, M.: Successive convex approximation: analysis and applications. Ph.D. thesis, University of Minnesota (2014)

  26. Razaviyayn, M., Hong, M., Luo, Z.-Q., Pang, J.S.: Parallel successive convex approximation for nonsmooth nonconvex optimization. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K. (eds.) Advances in Neural Information Processing Systems 27, pp. 1440–1448. Curran Associates, Inc., Montreal (2014)

  27. Scutari, G., Facchinei, F., Song, P., Palomar, D.P., Pang, J.S.: Decomposition by partial linearization: parallel optimization of multi-agent systems. IEEE Trans. Signal Process. 62(3), 641–656 (2014)


  28. Scutari, G., Facchinei, F., Lampariello, L.: Parallel and distributed methods for constrained nonconvex optimization-part I: theory. IEEE Trans. Signal Process. 65(8), 1929–1944 (2017)


  29. Scutari, G., Facchinei, F., Lampariello, L., Sardellitti, S., Song, P.: Parallel and distributed methods for constrained nonconvex optimization-part II: applications in communications and machine learning. IEEE Trans. Signal Process. 65(8), 1945–1960 (2017)


  30. Razaviyayn, M., Hong, M., Luo, Z.-Q.: A unified convergence analysis of block successive minimization methods for nonsmooth optimization. SIAM J. Optim. 23(2), 1126–1153 (2013)


  31. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)


  32. Gong, P., Zhang, C., Lu, Z., Huang, J.Z., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In: Proceedings of the 30th International Conference on International Conference on Machine Learning, vol. 28. Atlanta, GA, USA, pp. 37–45 (2013)

  33. Rakotomamonjy, A., Flamary, R., Gasso, G.: DC proximal Newton for nonconvex optimization problems. IEEE Trans. Neural Netw. Learn. Syst. 27(3), 636–647 (2016)


  34. Le, H.M., Ta, M.T.: DC programming and DCA for solving minimum sum-of-squares clustering using weighted dissimilarity measures. In: Transactions on Computational Intelligence XIII. LNCS, vol. 8342, pp. 113–131. Springer, Berlin (2014)

  35. Le Thi, H.A., Huynh, V.N., Pham Dinh, T.: DC programming and DCA for general DC programs. In: van Do, T., Le Thi, H.A., Nguyen, N.T. (eds.) Advanced Computational Methods for Knowledge Engineering, pp. 15–35. Springer, Cham (2014)


  36. Solodov, M.V.: On the sequential quadratically constrained quadratic programming methods. Math. Oper. Res. 29(1), 64–79 (2004)


  37. Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B.: Novel DCA based algorithms for a special class of nonconvex problems with application in machine learning. Appl. Math. Comput. 409, 1–22 (2021)


  38. Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(\cal{O} (1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)


  39. Phan, D.N., Le, H.M., Le Thi, H.A.: Accelerated difference of convex functions algorithm and its application to sparse binary logistic regression. In: 27th International Joint Conference on Artificial Intelligence and 23rd European Conference on Artificial Intelligence (IJCAI-ECAI 2018), Stockholm, Sweden, pp. 1369–1375 (2018)

  40. Grippo, L., Sciandrone, M.: Nonmonotone globalization techniques for the Barzilai-Borwein gradient method. Comput. Optim. Appl. 23(2), 143–169 (2002)


  41. Wright, S.J., Nowak, R.D., Figueiredo, M.A.T.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 57(7), 2479–2493 (2009)


  42. Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964)


  43. de Oliveira, W., Tcheou, M.P.: An inertial algorithm for dc programming. Set-Valued Var. Anal. 27(4), 895–919 (2019)


  44. Phan, D.N., Le Thi, H.A.: DCA based Algorithm with Extrapolation for Nonconvex Nonsmooth Optimization (Submitted) (2021)

  45. Fukushima, M., Mine, H.: A generalized proximal point algorithm for certain non-convex minimization problems. Int. J. Syst. Sci. 12(8) (1981)

  46. Aragón Artacho, F., Fleming, R.M.T., Phan, T.V.: Accelerating the DC algorithm for smooth functions. Math. Program. 169(1), 95–118 (2018)

  47. Aragón Artacho, F.J., Phan, T.V.: The boosted difference of convex functions algorithm for nonsmooth functions. SIAM J. Optim. 30(1), 980–1006 (2020)


  48. Niu, Y.-S., Wang, Y.-J., Le Thi, H.A., Pham Dinh, T.: Higher-order Moment Portfolio Optimization via The Difference-of-Convex Programming and Sums-of-Squares (submitted) (2021)

  49. Le Thi, H.A., Vu, V.H.K.: Accelerated Difference of Convex functions Algorithms: a comparative study on two approaches and applications in Machine Learning. Technical report, University of Lorraine (2021)

  50. Le Thi, H.A., Pham Dinh, T.: D.C. programming approach to the multidimensional scaling problem. In: Migdalas, A., Pardalos, P.M., Värbrand, P. (eds.) From Local to Global Optimization, pp. 231–276. Springer, Boston (2001)

  51. Li, H., Lin, Z.: Accelerated proximal gradient methods for nonconvex programming. In: Advances in Neural Information Processing Systems, pp. 377–387 (2015)

  52. Yao, Q., Kwok, J.T., Gao, F., Chen, W., Liu, T.Y.: Efficient inexact proximal gradient algorithm for nonconvex problems. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, pp. 3308–3314 (2017)

  53. Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69(2), 297–324 (2018)


  54. Lu, Z., Zhou, Z., Sun, Z.: Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization. Math. Program. 176(1), 369–401 (2019)


  55. Lu, Z., Zhou, Z.: Nonmonotone Enhanced Proximal DC Algorithms for a Class of Structured Nonsmooth DC Programming. SIAM J. Optim. 29, 2725–2752 (2019)


  56. Yu, P., Pong, T.K.: Iteratively reweighted \(\ell _1\) algorithms with extrapolation. Comput. Optim. Appl. 73, 353–386 (2019)


  57. Tsiligkaridis, T., Marcheret, E., Goel, V.: A difference of convex functions approach to large-scale log-linear model estimation. IEEE Trans. Audio Speech Lang. Process. 21(11), 2255–2266 (2013)


  58. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137(1), 91–129 (2013)


  59. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Image Sci. 2, 183–202 (2009)


  60. van Ackooij, W., de Oliveira, W.: Nonsmooth and nonconvex optimization via approximate difference-of-convex decompositions. J. Optim. Theory Appl. 182, 49–80 (2019)


  61. Le Thi, H.A., Phan, D.N., Le, H.M.: DCA-Like and its accelerated scheme for a class of structured Nonconvex Optimization Problems (Submitted) (2021)

  62. Le Thi, H.A., Pham Dinh, T.: Solving a class of linearly constrained indefinite quadratic problems by D.C. algorithms. J. Global Optim. 11(3), 253–285 (1997)


  63. Pham Dinh, T., Nguyen Canh, N., Le Thi, H.A.: An efficient combination of DCA and B&B using DC/SDP relaxation for globally solving binary quadratic programs. J. Global Optim. 48(4), 595–632 (2010)


  64. Hiriart-Urruty, J.-B., Lemarechal, C.: Convex Analysis and Minimization Algorithms, Parts I & II. Springer, Berlin (1993)


  65. Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series. Princeton University Press, Princeton (1970)


  66. Le Thi, H.A., Ho, V.T.: Online learning based on online DCA and application to online classification. Neural Comput. 32(4), 759–793 (2020)


  67. Shor, N.Z.: Minimization Methods for Non-differentiable Functions. Springer, Berlin (1985)


  68. Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B.: Stochastic DCA for the large-sum of non-convex functions problem and its application to group variable selection in classification. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 3394–3403. PMLR, Sydney, NSW, Australia (2017)

  69. Schmidt, M., Le Roux, N., Bach, F.: Minimizing finite sums with the stochastic average gradient. Math. Program. 162(1–2), 83–112 (2017)


  70. Le Thi, H.A., Luu, H.P.H., Le, H.M., Pham Dinh, T.: Stochastic DCA with variance reduction and applications in machine learning. J. Mach. Learn. Res. 23(206), 1–44 (2022)

  71. Liu, J., Cui, Y., Pang, J.S., Sen, S.: Two-stage stochastic programming with linearly bi-parameterized quadratic recourse. SIAM J. Optim. 30(3), 2530–2558 (2020)


  72. Nitanda, A., Suzuki, T.: Stochastic difference of convex algorithm and its application to training deep Boltzmann machines. In: Singh, A., Zhu, J. (eds.) Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 54, pp. 470–478. PMLR, Florida, USA (2017)

  73. Xu, Y., Qi, Q., Lin, Q., Jin, R., Yang, T.: Stochastic optimization for DC functions and non-smooth non-convex regularizers with non-asymptotic convergence. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 97, pp. 6942–6951. PMLR, California, USA (2019)

  74. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(61), 2121–2159 (2011)


  75. Xiao, L., Zhang, T.: A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4), 2057–2075 (2014)


  76. Le Thi, H.A., Huynh, V.N., Pham Dinh, T., Luu, H.P.H.: Stochastic difference-of-convex algorithms for solving nonconvex optimization problems. SIAM J. Optim. 32(3), 2263–2293 (2022)

  77. Le Thi, H.A., Pham Dinh, T., Luu, H.P.H., Le, H.M.: Deterministic and stochastic DCA for DC programming. In: Handbook of Engineering Statistics, 2nd edn. Springer, Cham (2021) (in press)

  78. Le Thi, H.A., Luu, H.P.H., Pham Dinh, T.: Online stochastic DCA with applications to principal component analysis. IEEE Trans. Neural Netw. Learn. Syst. (in press) (2022)

  79. Le Thi, H.A., Pham Dinh, T.: A continuous approach for globally solving linearly constrained quadratic zero-one programming problems. Optimization 50(1–2), 93–120 (2001)


  80. Le Thi, H.A., Pham Dinh, T., Thoai, N.V., Nguyen Canh, N.: D.C. optimization techniques for solving a class of nonlinear bilevel programs. J. Global Optim. 44(3), 313–337 (2009)


  81. Le Thi, H.A., Pham Dinh, T., Le, D.M.: Numerical solution for optimization over the efficient set by DC optimization algorithms. Oper. Res. Lett. 19(3), 117–128 (1996)


  82. Le Thi, H.A., Pham Dinh, T., Muu, L.D.: Simplicially constrained D.C. optimization over the efficient and weakly efficient sets. J. Optim. Theory Appl. 117(3), 503–521 (2003)


  83. Le Thi, H.A., Pham Dinh, T., Thoai, N.V.: Combination between global and local methods for solving an optimization problem over the efficient set. Eur. J. Oper. Res. 142(2), 258–270 (2002)


  84. Le Thi, H.A., Pham Dinh, T., Le, H.M., Vo, X.T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)


  85. Ge, R., Huang, C.: A continuous approach to nonlinear integer programming. Appl. Math. Comput. 34(1), 39–60 (1989)


  86. Pham Dinh, T., Le Thi, H.A., Pham, V.N., Niu, Y.-S.: DC programming approaches for discrete portfolio optimization under concave transaction costs. Optim. Lett. 10(2), 261–282 (2016)


  87. Le Thi, H.A., Le, H.M., Nguyen, V.V., Pham Dinh, T.: A DC programming approach for feature selection in support vector machines learning. J. Adv. Data Anal. Classif. 2(3), 259–278 (2008)


  88. Le Thi, H.A., Nguyen, V.V., Ouchani, S.: Gene selection for cancer classification using DCA. J. Front. Comput. Sci. Technol. 3(6), 612–620 (2009)


  89. Ong, C.S., Le Thi, H.A.: Learning sparse classifiers with difference of convex functions algorithms. Optim. Methods Softw. 28(4), 830–854 (2013)


  90. Thiao, M., Pham Dinh, T., Le Thi, H.A.: A DC programming approach for sparse eigenvalue problem. In: Fürnkranz, J., Joachims, T. (eds.) Proceedings of the 27th International Conference on Machine Learning, pp. 1063–1070. Omnipress, Haifa, Israel (2010)

  91. Le Thi, H.A., Le, H.M., Pham Dinh, T.: Feature selection in machine learning: an exact penalty approach using a difference of convex function algorithm. Mach. Learn. 101(1–3), 163–186 (2015)


  92. Le Thi, H.A., Pham Dinh, T., Thiao, M.: Efficient approaches for \(\ell _2-\ell _0\) regularization and applications to feature selection in SVM. Appl. Intell. 45(2), 549–565 (2016)


  93. Le Thi, H.A., Phan, D.N., Pham Dinh, T.: DCA based approaches for bi-level variable selection and application for estimate multiple sparse covariance matrices. Neurocomputing 466, 162–177 (2021)


  94. Phan, D.N., Le Thi, H.A.: Group variable selection via \(\ell _{p,0}\) regularization and application to optimal scoring. Neural Netw. 118, 220–234 (2019)


  95. Pham Dinh, T., Huynh, V.N., Le Thi, H.A., Ho, V.T.: Alternating DC algorithm for partial DC programming problems. J. Global Optim. 82(4), 897–928 (2022)


  96. Le Thi, H.A., Huynh, V.N., Pham Dinh, T.: Minimizing compositions of differences-of-convex functions with smooth mappings. Math. Oper. Res. (2023) (Minor revision)

  97. Le Thi, H.A., Belghiti, M.T., Pham Dinh, T.: A new efficient algorithm based on DC programming and DCA for clustering. J. Global Optim. 37(4), 593–608 (2007)


  98. Le Thi, H.A., Le, H.M., Pham Dinh, T.: New and efficient DCA based algorithms for minimum sum-of-squares clustering. Pattern Recogn. 47(1), 388–401 (2014)


  99. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Norwell (1981)


  100. Le Thi, H.A., Le, H.M., Pham Dinh, T.: Fuzzy clustering based on nonconvex optimisation approaches using difference of convex (DC) functions algorithms. Adv. Data Anal. Classif. 1(2), 85–104 (2007)


  101. Le, H.M., Nguyen, T.B.T., Ta, M.T., Le Thi, H.A.: Image segmentation via feature weighted fuzzy clustering by a DCA based algorithm. In: Advanced Computational Methods for Knowledge Engineering. Studies in Computational Intelligence, vol. 479, pp. 53–63. Springer, Heidelberg (2013)

  102. Le, H.M., Le Thi, H.A., Pham Dinh, T., Huynh, V.N.: Block clustering based on difference of convex functions (DC) programming and DC algorithms. Neural Comput. 25(10), 2776–2807 (2013)


  103. Le Thi, H.A., Pham Dinh, T., Huynh, V.N.: Optimization based DC programming and DCA for hierarchical clustering. Eur. J. Oper. Res. 183(3), 1067–1085 (2007)


  104. Le Thi, H.A., Le, H.M., Nguyen, V.A.: DCA-like for GMM clustering with sparse regularization (submitted) (2021)

  105. Nguyen, V.A., Le Thi, H.A., Le, H.M.: A DCA based algorithm for feature selection in model-based clustering. In: Nguyen, N.T., Jearanaitanakij, K., Selamat, A., Trawiński, B., Chittayasothorn, S. (eds.) Intelligent Information and Database Systems, pp. 404–415. Springer, Cham (2020)


  106. Brandes, U., Delling, D., Gaertler, M., Gorke, R., Hoefer, M., Nikoloski, Z., Wagner, D.: On modularity clustering. IEEE Trans. Knowl. Data Eng. 20(2), 172–188 (2008)


  107. Le Thi, H.A., Nguyen, M.C., Pham Dinh, T.: A DC programming approach for finding communities in networks. Neural Comput. 26(12), 2827–2854 (2014)


  108. Le Thi, H.A., Nguyen, M.C.: Self-organizing maps by difference of convex functions optimization. Data Min. Knowl. Disc. 28(5–6), 1336–1365 (2014)


  109. Le Thi, H.A., Vo, X.T., Pham Dinh, T.: Efficient nonnegative matrix factorization by DC programming and DCA. Neural Comput. 28(6), 1163–1216 (2016)


  110. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(Nov), 2579–2605 (2008)


  111. Yang, Z., Peltonen, J., Kaski, S.: Majorization-Minimization for Manifold Embedding. In: Lebanon, G., Vishwanathan, S.V.N. (eds.) Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 38, pp. 1088–1097. PMLR, San Diego, California (2015)

  112. Neumann, J., Schnorr, G., Steidl, G.: Combined SVM-based feature selection and classification. Mach. Learn. 61, 129–150 (2005)


  113. Bradley, P.S., Mangasarian, O.L.: Feature selection via concave minimization and support vector machines. In: Machine Learning Proceedings of the Fifteenth International Conference, pp. 82–90. Morgan Kaufmann Publishers Inc., San Francisco (1998)

  114. Le Thi, H.A., Ho, V.T.: DCA for Gaussian kernel support vector machines with feature selection. In: Modelling, Computation and Optimization in Information Systems and Management Sciences, pp. 223–234. Springer, Cham (2022)

  115. Le, H.M., Le Thi, H.A., Nguyen, M.C.: Sparse semi-supervised support vector machines by DC programming and DCA. Neurocomputing 153, 62–76 (2015)


  116. Le Thi, H.A., Nguyen, M.C.: DCA based algorithms for feature selection in multi-class support vector machine. Ann. Oper. Res. 249(1), 273–300 (2017)


  117. Le Thi, H.A., Phan, D.N.: DC programming and DCA for sparse Fisher linear discriminant analysis. Neural Comput. Appl. 28(9), 2809–2822 (2016)


  118. Le Thi, H.A., Phan, D.N.: DC programming and DCA for sparse optimal scoring problem. Neurocomputing 186, 170–181 (2016)


  119. Le Thi, H.A., Nguyen, T.B.T., Le, H.M.: Sparse signal recovery by difference of convex functions algorithms. In: Intelligent Information and Database Systems. LNCS, vol. 7803, pp. 387–397. Springer, Berlin (2013)

  120. Yang, L., Qian, Y.: A sparse logistic regression framework by difference of convex functions programming. Appl. Intell. 45(2), 241–254 (2016)


  121. Wang, L., Kim, Y., Li, R.: Calibrating nonconvex penalized regression in ultra-high dimension. Ann. Stat. 41(5), 2505–2536 (2013)


  122. Song, Y., Lin, L., Jian, L.: Robust check loss-based variable selection of high-dimensional single-index varying-coefficient model. Commun. Nonlinear Sci. 36, 109–128 (2016)


  123. Wu, Y., Liu, Y.: Variable selection in quantile regression. Stat. Sin. 19, 801–817 (2009)


  124. Gasso, G., Rakotomamonjy, A., Canu, S.: Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Trans. Signal Process. 57(12), 4686–4698 (2009)


  125. Nguyen, T.B.T., Le Thi, H.A., Le, H.M., Vo, X.T.: DC approximation approach for \(\ell _0\)-minimization in compressed sensing. In: Le Thi, H.A., Nguyen, N.T., Do, T.V. (eds.) Advanced Computational Methods for Knowledge Engineering. Advances in Intelligent Systems and Computing, vol. 358, pp. 37–48. Springer, Cham (2015)


  126. Esser, E., Lou, Y., Xin, J.: A method for finding structured sparse solutions to nonnegative least squares problems with applications. SIAM J. Imag. Sci. 6(4), 2010–2046 (2013)


  127. Lou, Y., Osher, S., Xin, J.: Computational aspects of constrained l1–l2 minimization for compressive sensing. In: Le Thi, H.A., Pham Dinh, T., Nguyen, N.T. (eds.) Modelling, Computation and Optimization in Information Systems and Management Sciences. Advances in Intelligent Systems and Computing, vol. 359, pp. 169–180. Springer, Cham (2015)


  128. Lou, Y., Yin, P., He, Q., Xin, J.: Computing sparse representation in a highly coherent dictionary based on difference of L1 and L2. J. Sci. Comput. 64(1), 178–196 (2015)


  129. Yin, P., Lou, Y., He, Q., Xin, J.: Minimization of \(\ell _{1-2}\) for compressed sensing. SIAM J. Sci. Comput. 37(1), 536–563 (2015)


  130. Gorodnitsky, I.F., Rao, B.D.: Sparse signal reconstructions from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans. Signal Process. 45(3), 600–616 (1997)


  131. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)


  132. Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006)


  133. Candes, E.J., Wakin, M., Boyd, S.: Enhancing sparsity by reweighted-\(l_{1}\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)


  134. Chartrand, R., Yin, W.: Iteratively reweighted algorithms for compressive sensing. In: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3869–3872 (2008)

  135. Zou, H., Li, R.: One-step sparse estimates in nonconcave penalized likelihood models. Ann. Stat. 36(4), 1509–1533 (2008)


  136. Zou, H., Hastie, T., Tibshirani, R.J.: Sparse principal component analysis. J. Comput. Graph. Stat. 15, 265–286 (2006)


  137. Cotter, S.F., Rao, B.D., Engan, K., Kreutz-Delgado, K.: Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans. Signal Process. 53, 2477–2488 (2005)


  138. Chen, J., Huo, X.: Theoretical results on sparse representations of multiple-measurement vectors. IEEE Trans. Signal Process. 54, 4634–4643 (2006)


  139. Sun, L., Liu, J., Chen, J., Ye, J.: Efficient recovery of jointly sparse vectors. In: Bengio, Y., Schuurmans, D., Lafferty, J.D., Williams, C.K.I., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22, pp. 1812–1820. Curran Associates Inc, Vancouver (2009)


  140. Le Thi, H.A., Le, H.M., Phan, D.N., Tran, B.: Stochastic DCA for minimizing a large sum of DC functions with application to multi-class logistic regression. Neural Netw. 132, 220–231 (2020)


  141. Danaher, P., Wang, P., Witten, D.M.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Series B Stat. Methodol. 76, 373–397 (2014)


  142. Calandriello, D., Lazaric, A., Restelli, M.: Sparse multi-task reinforcement learning. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 27, pp. 819–827. Curran Associates Inc, Montreal (2014)


  143. Phan, D.N., Le Thi, H.A., Pham Dinh, T.: Sparse covariance matrix estimation by DCA-based algorithms. Neural Comput. 29(11), 3040–3077 (2017)


  144. Vo, X.T., Le Thi, H.A., Pham Dinh, T., Nguyen, T.B.T.: DC programming and DCA for dictionary learning. In: Computational Collective Intelligence vol. 9329, pp. 295–304. Springer, Cham (2015)

  145. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.S.: Robust Optimization. Princeton Series in Applied Mathematics. Princeton University Press, Princeton (2009)


  146. Le Thi, H.A., Vo, X.T., Pham Dinh, T.: Feature selection for linear SVMs under uncertain data: robust optimization based on difference of convex functions algorithms. Neural Netw. 59, 36–50 (2014)


  147. Vo, X.T.: Learning with sparsity and uncertainty by difference of convex functions optimization. Ph.D. thesis, University of Lorraine (2015)

  148. Vo, X.T., Le Thi, H.A., Pham Dinh, T.: Robust optimization for clustering. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.-P. (eds.) Intelligent Information and Database Systems, pp. 671–680. Springer, Berlin (2016)


  149. Shalev-Shwartz, S.: Online learning and online convex optimization. Found. Trends® Mach. Learn. 4(2), 107–194 (2012)

  150. Zinkevich, M.: Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings of the 20th on International Conference on Machine Learning, pp. 928–935. AAAI Press, Washington (2003)

  151. Shalev-Shwartz, S., Singer, Y.: A primal-dual perspective of online learning algorithms. Mach. Learn. 69(2–3), 115–142 (2007)


  152. Chung, T.H.: Approximate methods for sequential decision making using expert advice. In: Proceedings of the Seventh Annual Conference on Computational Learning Theory. COLT ’94, pp. 183–189. ACM, New York (1994)

  153. Le Thi, H.A., Ho, V.T.: DCA for online prediction with expert advice. Neural Comput. Appl. 33(15), 9521–9544 (2021)


  154. Le Thi, H.A., Ho, V.T., Pham Dinh, T.: A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning. J. Global Optim. 73(2), 279–310 (2019)


  155. Calafiore, G.C., Gaubert, S., Possieri, C.: A universal approximation result for difference of log-sum-exp neural networks. IEEE Trans. Neural Netw. Learn. Syst. 31(12), 5603–5612 (2020)


  156. Brüggemann, S., Possieri, C.: On the use of difference of log-sum-exp neural networks to solve data-driven model predictive control tracking problems. IEEE Control Syst. Lett. 5(4), 1267–1272 (2020)


  157. Sankaranarayanan, P., Rengaswamy, R.: CDiNN-Convex Difference Neural Networks. Preprint at https://arxiv.org/abs/2103.17231 (2021)

  158. Cui, Y., He, Z., Pang, J.-S.: Multicomposite nonconvex optimization for training deep neural networks. SIAM J. Optim. 30(2), 1693–1723 (2020)


  159. Berrada, L., Zisserman, A., Kumar, M.P.: Trusting SVM for piecewise linear CNNs. Preprint at https://arxiv.org/abs/1611.02185 (2016)

  160. Mangasarian, O.L., Fromovitz, S.: The Fritz John necessary optimality conditions in the presence of equality and inequality constraints. J. Math. Anal. Appl. 17(1), 37–47 (1967)


  161. Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)



Funding

No funding was received to assist with the preparation of this manuscript.

Author information


Corresponding author

Correspondence to Hoai An Le Thi.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information


This is the full paper of the author's plenary lecture, as the winner of the Constantin Caratheodory Prize 2021, at the World Congress on Global Optimization WCGO 2021.

Appendices

Appendix A Convergence of Standard DCA

Let \(X \subset \mathbb {R}^n\) and \(Y \subset \mathbb {R}^n\) be two nonempty convex sets, and let \(\rho _{i}\) and \(\rho _{i}^{*}\) \((i=1,2)\) be nonnegative real numbers such that \(0 \le \rho _{i}<\rho (f_{i},X)\) (resp. \(0 \le \rho _{i}^{*}<\rho (f_{i}^{*},Y)\)), where \(\rho _{i}=0\) (resp. \(\rho _{i}^{*}=0\)) if \(\rho (f_{i},X)=0\) (resp. \(\rho (f_{i}^{*},Y)=0\)), and where \(\rho _{i}\) (resp. \(\rho _{i}^{*}\)) may take the value \(\rho (f_{i},X)\) (resp. \(\rho (f_{i}^{*},Y)\)) if the latter is attained. We set \(f_{1}=g\) and \(f_{2}=h\), and write \(dx^{k}:=x^{k+1}-x^{k}\) and \(dy^{k}:=y^{k+1}-y^{k}\).
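For ease of reading in this appendix, recall (the precise definition is given in the main body of the paper and is restated here only for convenience) that, for a convex function \(\varphi \) and a convex set \(C\), \(\rho (\varphi ,C)\) denotes the modulus of strong convexity of \(\varphi \) on \(C\):

$$\begin{aligned} \rho (\varphi ,C):=\sup \Big \{\rho \ge 0: \varphi -\frac{\rho }{2}\Vert \cdot \Vert ^{2} \text { is convex on } C\Big \}. \end{aligned}$$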

Theorem 11

Let \(X \subset \mathbb {R}^n\) and \(Y \subset \mathbb {R}^n\) be two nonempty convex sets containing the sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) generated by DCA, respectively, and \(dx^{k}:=x^{k+1}-x^{k}, dy^{k}:=y^{k+1}-y^{k}\). The DCA is a descent method without linesearch but with global convergence, which enjoys the following key properties:

1 For the primal DC program \((P_{dc})\)

The decrease of the sequence \(\{(g-h)(x^{k})\}\) is expressed by \((g-h)(x^{k+1}) \le (h^{*}-g^{*})(y^{k})-\dfrac{\rho _{2}}{2}\Vert dx^{k}\Vert ^{2} \le (g-h)(x^{k})-\dfrac{\rho _{1}+\rho _{2}}{2}\Vert dx^{k}\Vert ^{2}, \forall k\) where the equality

$$\begin{aligned} (g-h)(x^{k+1})=(g-h)(x^{k}) \end{aligned}$$
(A1)

is verified if and only if \(x^{k}\in \partial g^{*}(y^{k})\), \(y^{k}\in \partial h(x^{k+1})\) and \((\rho _{1}+\rho _{2})dx^{k}=0.\)

In this case, one obtains the following main statements:

1.1 \(x^{k},x^{k+1}\) are DC-critical points of \(g-h\) satisfying \(y^{k}\in (\partial g(x^{k})\cap \partial h(x^{k}))\) and \(y^{k}\in (\partial g(x^{k+1})\cap \partial h(x^{k+1})),\)

1.2 \(y^{k}\) is a DC critical point of \(h^{*}-g^{*}\) and \([x^{k},x^{k+1}]\subset (\partial g^{*}(y^{k})\cap \partial h^{*}(y^{k})),\)

1.3 If \(\rho _{1}+\rho _{2}>0\), then \(x^{k+1}=x^{k}\); \(y^{k}=y^{k-1}\) if \(\rho _{1}^{*}>0\); and \(y^{k+1}=y^{k}\) if \(\rho _{2}^{*}>0\).

Furthermore, if g or h is strictly convex on X, then \(x^{k+1}=x^{k}\).

In such a case (A1), the DCA terminates at the \(k^{th}\) iteration (finite convergence of DCA).

2 For the dual DC program \((D_{dc})\)

Similarly, the DC duality provides the dual DC program \((D_{dc})\) with

$$\begin{aligned} (h^{*}-g^{*})(y^{k+1}) \le (g-h)(x^{k+1})-\dfrac{\rho _{1}^{*}}{2}\Vert dy^{k}\Vert ^{2} \le (h^{*}-g^{*})(y^{k})-\dfrac{\rho _{1}^{*}+\rho _{2}^{*}}{2}\Vert dy^{k}\Vert ^{2}. \end{aligned}$$

The equality

$$\begin{aligned} (h^{*}-g^{*})(y^{k+1})=(h^{*}-g^{*})(y^{k}) \end{aligned}$$
(A2)

occurs if and only if \(x^{k+1}\in \partial g^{*}(y^{k+1}),y^{k}\in \partial h(x^{k+1})\) and \((\rho _{1}^{*}+\rho _{2}^{*})dy^{k}=0\).

In this case, the following properties hold:

2.1 The equality \((h^{*}-g^{*})(y^{k+1})=(g-h)(x^{k+1})\) holds and \(y^{k},y^{k+1}\) are DC critical points of \(h^{*}-g^{*}\) with \(x^{k+1}\in (\partial g^{*}(y^{k})\cap \partial h^{*}(y^{k}))\) and \(x^{k+1}\in (\partial g^{*}(y^{k+1})\cap \partial h^{*}(y^{k+1})),\)

2.2 \(x^{k+1}\) is a DC critical point of \(g-h\) and \([y^{k},y^{k+1}]\subset (\partial g(x^{k+1})\cap \partial h(x^{k+1})),\)

2.3 \(y^{k+1}=y^{k}\) if \(\rho _{1}^{*}+\rho _{2}^{*}>0,x^{k+1}=x^{k}\) if \(\rho _{2}>0\) and \(x^{k+2}=x^{k+1}\) if \(\rho _{1}>0\).

Furthermore, if \(g^{*}\) or \(h^{*}\) is strictly convex on \(\mathbb {R}^n\), then \(y^{k+1}=y^{k}.\)

As for 1.3, in the case (A2), the DCA terminates at the \(k^{th}\) iteration (finite convergence of DCA).

3. If \(\rho _{1}+\rho _{2}>0\), then the primal DC series \(\sum _{k}\Vert x^{k+1}-x^{k}\Vert ^{2}\) converges, its partial sums being bounded above as follows:

$$\begin{aligned} \frac{\rho _{1}+\rho _{2}}{2}\sum \limits _{l=0}^{k}\Vert x^{l+1}-x^{l}\Vert ^{2}&\le (g-h)(x^{0})-(g-h)(x^{k+1}) \nonumber \\&\le (g-h)(x^{0})-\beta \le (g-h)(x^{0})-\alpha , \forall k. \end{aligned}$$
(A3)

Dually, if \(\rho _{1}^{*}+\rho _{2}^{*}>0\), then the dual DC series \(\sum _{k}\Vert y^{k+1}-y^{k}\Vert ^{2}\) converges, its partial sums being bounded above as follows:

$$\begin{aligned} \frac{\rho _{1}^{*}+\rho _{2}^{*}}{2}\sum \limits _{l=0}^{k}\Vert y^{l+1}-y^{l}\Vert ^{2}&\le (h^{*}-g^{*})(y^{0})-(h^{*}-g^{*})(y^{k+1})\le (h^{*}-g^{*})(y^{0})-\beta \nonumber \\&\le (h^{*}-g^{*})(y^{0})-\alpha ,~\forall k. \end{aligned}$$
(A4)

4. If \(\alpha \) is finite, the sequences \(\{(g-h)(x^{k})\}\), \(\{(h^{*}-g^{*})(y^{k})\}\) decrease and converge to the same limit \(\beta \ge \alpha \): \(\lim _{k\rightarrow +\infty }(g-h)(x^{k})=\lim _{k\rightarrow +\infty }(h^{*}-g^{*})(y^{k})=\beta .\)

5. If \(\alpha \) is finite and the sequences \(\{x^{k}\}\) and \(\{y^{k}\}\) are bounded, then for every limit point \(x^{*}\) of \(\{x^{k}\}\) (resp. \(y^{*}\) of \(\{y^{k}\}\)) there exists a limit point \(y^{*}\) of \(\{y^{k}\}\) (resp. \(x^{*}\) of \(\{x^{k}\}\)) such that \((x^{*},y^{*})\in [\partial g^{*}(y^{*})\cap \partial h^{*}(y^{*})]\times [\partial g(x^{*})\cap \partial h(x^{*})]\) and \((g-h)(x^{*})=(h^{*}-g^{*})(y^{*})=\beta \ge \alpha .\) Such a point \(x^{*}\) (resp. \(y^{*}\)) is a DC critical point of \(g-h\) (resp. \(h^{*}-g^{*}\)).

6. DCA’s complexity for primal and dual DC programs

Let \(x^{*}\) be a DC critical point of \(g-h\) defined as a limit point of the sequence \(\{x^{k}\}\) computed by the primal DCA. Then, from (A3), one deduces \(f(x^{*})=(g-h)(x^{*})=\beta :=\lim _{k\rightarrow +\infty }f(x^{k})=\lim _{k\rightarrow +\infty }(g-h)(x^{k})\) and \(\frac{\rho _{1}+\rho _{2}}{2}(k+1)\min \{\Vert x^{l+1}-x^{l}\Vert ^{2}:l=0,\ldots ,k\}\le [f(x^{0})-f(x^{*})].\) Moreover, if \(\rho _{1}+\rho _{2}>0\), then \(\min \{\Vert x^{l+1}-x^{l}\Vert :l=0,\ldots ,k\}\le \frac{2^{1/2}[f(x^{0})-f(x^{*})]^{1/2}}{(\rho _{1}+\rho _{2})^{1/2}(k+1)^{1/2}}.\)

Likewise, by using the same reasoning for the sequence \(\{y^{k}\}\) via (A4), we get similar results for the dual DCA: \((h^{*}-g^{*})(y^{*})=\beta :=\lim _{k\rightarrow +\infty }(h^{*}-g^{*})(y^{k})\) and \(\frac{\rho _{1}^{*}+\rho _{2}^{*}}{2}\sum \limits _{l=0}^{k}\Vert y^{l+1}-y^{l}\Vert ^{2} \le (h^{*}-g^{*})(y^{0})-(h^{*}-g^{*})(y^{k+1}) \le (h^{*}-g^{*})(y^{0})-\beta \le (h^{*}-g^{*})(y^{0})-\alpha ,~\forall k.\) Hence, if \(\rho _{1}^{*}+\rho _{2}^{*}>0\), we obtain

\(\min \{\Vert y^{l+1}-y^{l}\Vert :l=0,\ldots ,k\}\le \frac{2^{1/2}[(h^{*}-g^{*})(y^{0})-(h^{*}-g^{*})(y^{*})]^{1/2}}{(\rho _{1}^{*}+\rho _{2}^{*})^{1/2}(k+1)^{1/2}}.\)

Therefore, both primal and dual DCA have a complexity \(O(1/\sqrt{k})\).
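To make the mechanics of the standard DCA behind Theorem 11 concrete, the following is a minimal, self-contained sketch (illustrative only, not the authors' code) applied to a box-constrained indefinite quadratic program in the spirit of [62]. The DC decomposition \(g(x)=\chi _{[l,u]}(x)+\frac{\rho }{2}\Vert x\Vert ^{2}+b^{T}x\), \(h(x)=\frac{\rho }{2}\Vert x\Vert ^{2}-\frac{1}{2}x^{T}Ax\) with \(\rho \ge \lambda _{\max }(A)\), as well as all function and variable names below, are choices made here purely for illustration.

```python
import numpy as np

# Illustrative sketch of the standard DCA for
#     min { 0.5 x'Ax + b'x : l <= x <= u },
# with g(x) = chi_[l,u](x) + (rho/2)||x||^2 + b'x and
#      h(x) = (rho/2)||x||^2 - 0.5 x'Ax, where rho >= lambda_max(A) makes h convex.
# Each iteration: y^k = grad h(x^k), then x^{k+1} = argmin_x { g(x) - <y^k, x> },
# which reduces here to a projection onto the box.

def dca_box_quadratic(A, b, lo, up, x0, rho=None, max_iter=1000, tol=1e-8):
    if rho is None:
        rho = max(float(np.linalg.eigvalsh(A).max()), 1e-6)  # ensures convexity of h
    x = x0.astype(float).copy()
    for k in range(max_iter):
        y = rho * x - A @ x                      # y^k = grad h(x^k)
        x_new = np.clip((y - b) / rho, lo, up)   # minimize g(.) - <y^k, .> over the box
        if np.linalg.norm(x_new - x) <= tol:     # ||x^{k+1} - x^k|| -> 0 (cf. Theorem 11)
            return x_new, k + 1
        x = x_new
    return x, max_iter

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    M = rng.standard_normal((n, n))
    A = (M + M.T) / 2.0                          # symmetric, typically indefinite
    b = rng.standard_normal(n)
    lo, up = -np.ones(n), np.ones(n)
    x_star, iters = dca_box_quadratic(A, b, lo, up, x0=np.zeros(n))
    print(iters, 0.5 * x_star @ A @ x_star + b @ x_star)
```

Each iteration requires only a matrix-vector product and a componentwise projection, which reflects the typically inexpensive DCA subproblems; the stopping test on \(\Vert x^{k+1}-x^{k}\Vert \) is consistent with the convergence properties stated in Theorem 11.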

Appendix B Global convergence of GDCA1

Denote by \(I(x):=\left\{ i\in \{1,\ldots ,m\}: ~ f_{i}(x)=p(x)\right\} \) the set of active indices at \(x\). We say that the extended Mangasarian-Fromovitz constraint qualification (EMFCQ) is satisfied at \(x^{*}\in E\) with \(I(x^{*})\not =\emptyset \) if

$$\begin{aligned} (\text {EMFCQ})\quad \begin{array}{l} \text {there is a vector } d\in \text {cone}(C-\{x^*\}) \text { (the cone hull of } C-\{x^*\}) \\ \text {such that } f_{i}^{\uparrow }(x^{*},d)<0 \text { for all } i\in I(x^{*}). \end{array} \end{aligned}$$

When the \(f_{i}\) are continuously differentiable, \(f_{i}^{\uparrow }(x^{*},d)=\langle \nabla f_{i}(x^{*}),d\rangle .\) Therefore, (EMFCQ) becomes the well-known Mangasarian-Fromovitz constraint qualification. It is well known that if the (extended) Mangasarian-Fromovitz constraint qualification is satisfied at a local minimizer \(x^{*}\) of problem (6), then the KKT first-order necessary condition (7) holds (see [160, 161]). In the global convergence theorem, we make use of the following assumption:

Assumption 3

The (extended) Mangasarian-Fromovitz constraint qualification (EMFCQ) is satisfied at any \(x\in {{\mathbb {R}}}^{n}\) with \(p(x)\ge 0.\)

When the \(f_{i}\), \(i=1,\ldots ,m,\) are all convex, this assumption is obviously satisfied under the Slater-type condition that there exists \(\bar{x}\in C\) with \(f_{i}(\bar{x})<0\) for all \(i=1,\ldots ,m\), as the short argument below indicates.
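The following display is only a sketch of that standard argument, stated under the convexity of the \(f_{i}\) and with \(\bar{x}\) denoting the Slater-type point above: for any \(x\) with \(p(x)\ge 0\) and any \(i\in I(x)\), the direction \(d=\bar{x}-x\in \text {cone}(C-\{x\})\) satisfies

$$\begin{aligned} f_{i}^{\uparrow }(x,d)=f_{i}^{\prime }(x;d)\le f_{i}(x+d)-f_{i}(x)=f_{i}(\bar{x})-f_{i}(x)\le f_{i}(\bar{x})<0, \end{aligned}$$

since, for a convex function, the generalized directional derivative coincides with the usual one and is bounded above by the corresponding difference of function values, and \(f_{i}(x)=p(x)\ge 0\) for every active index; hence (EMFCQ) holds at \(x\).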

Theorem 12

Suppose that \(C\subseteq {{\mathbb {R}}}^{n}\) is a nonempty closed convex set and \(f_{i}\), \(i=1,\ldots ,m\), are DC functions on \(C\). Suppose further that Assumptions 1–3 are verified. Let \(\delta >0,\) \(\beta _{1}>0\) be given. Let \(\{x^{k}\}\) be a sequence generated by GDCA1. Then GDCA1 either stops, after finitely many iterations, at a KKT point \(x^{k}\) for problem (6) or generates an infinite sequence \(\{x^{k}\}\) of iterates such that \(\lim _{k\rightarrow \infty }\Vert x^{k+1}-x^{k}\Vert =0\) and every limit point \(x^{\infty }\) of the sequence \(\{x^{k}\}\) is a KKT point of problem (6).

Appendix C Global convergence of GDCA2

Recall, as defined in the preceding section, that \(\varphi _{k}(x):=f_{0}(x)+\beta _{k}p^{+}(x).\) The following lemma is needed to investigate the convergence of GDCA2.

Lemma 13

The sequence \((x^{k},t^{k})\) generated by GDCA2 satisfies the following inequality: \(\varphi _{k}(x^{k})-\varphi _{k}(x^{k+1})\ge \frac{\rho }{2}\Vert x^{k+1}-x^{k}\Vert ^{2}\) for all \(k=1,2,\ldots \), where \(\rho :=\rho (g_{0},C)+\rho (h_{0},C)+\min \{\rho (g_{i},C): ~ i=1,\ldots ,m\}.\)

Theorem 14

Suppose that \(C\subseteq {{\mathbb {R}}}^{n}\) is a nonempty closed convex set and \(f_{i}, i=1,\ldots ,m,\) are DC functions on \(C\) such that Assumptions 1 and 3 are verified. Suppose further that for each \(i=0,\ldots ,m,\) either \(g_{i}\) or \(h_{i}\) is differentiable on \(C\) and that \(\rho :=\rho (g_{0},C)+\rho (h_{0},C)+\min \{\rho (g_{i},C): i=1,\ldots ,m\}>0.\) Let \(\delta _{1},\delta _{2}>0,\) \(\beta _{1}>0\) be given. Let \(\{x^{k}\}\) be a sequence generated by GDCA2. Then GDCA2 either stops, after finitely many iterations, at a KKT point \(x^{k}\) for problem (6) or generates an infinite sequence \(\{x^{k}\}\) of iterates such that \(\lim _{k\rightarrow \infty }\Vert x^{k+1}-x^{k}\Vert =0\) and every limit point \(x^{\infty }\) of the sequence \(\{x^{k}\}\) is a KKT point of problem (6).

Note that, as shown in Theorems 12 and 14, the penalty parameter \(\beta _{k}\) is constant when \(k\) is sufficiently large. As observed from the proofs of these convergence theorems, the sequence \(\{\varphi _{k}(x^{k})\}\) of values of the penalty function \(\varphi _{k}(x)=f_{0}(x)+\beta _{k}p^{+}(x)\) along the sequence \(\{x^{k}\}\) generated by GDCA1 and GDCA2 is decreasing. These results remain valid if we replace, in (11), the variable \(t\) by \(t_{i}\) for \(i=1,\ldots ,m\) and the function \(\beta _{k}t\) by \(\beta _{k}\sum _{i=1}^{m}t_{i}\), as illustrated by the sketch below.
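For intuition only, and assuming \(p(x)=\max \{f_{i}(x): i=1,\ldots ,m\}\) so that \(p^{+}(x)=\max \{0,p(x)\}\) (the precise subproblem (11) is given in the main text and is not reproduced here), the penalized objective \(\varphi _{k}\) can be viewed through the usual slack-variable reformulation

$$\begin{aligned} \min _{x\in C,\; t\ge 0}\; f_{0}(x)+\beta _{k}t \quad \text {s.t.}\quad f_{i}(x)\le t,\; i=1,\ldots ,m, \end{aligned}$$

whose optimal slack for fixed \(x\) is \(t=p^{+}(x)\), so that the objective value reduces to \(\varphi _{k}(x)\); replacing the single slack \(t\) by individual slacks \(t_{i}\) and \(\beta _{k}t\) by \(\beta _{k}\sum _{i=1}^{m}t_{i}\) gives the variant mentioned in the last sentence above.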


About this article


Cite this article

Le Thi, H.A., Pham Dinh, T. Open issues and recent advances in DC programming and DCA. J Glob Optim 88, 533–590 (2024). https://doi.org/10.1007/s10898-023-01272-1

