Abstract
This paper studies several versions of the sparse optimization problem in statistical estimation defined by a pairwise separation objective. The sparsity (i.e., \(\ell _0\)) function is approximated by a folded concave function; the pairwise separation gives rise to an objective of the Z-type. After presenting several realistic estimation problems to illustrate the Z-structure, we introduce a linear-step inner-outer loop algorithm for computing a directional stationary solution of the nonconvex nondifferentiable folded concave sparsity problem. When specialized to a quadratic loss function with a Z-matrix and a piecewise quadratic folded concave sparsity function, the overall complexity of the algorithm is a low-order polynomial in the number of variables of the problem; thus the algorithm is strongly polynomial in this quadratic case. We also consider the parametric version of the problem that has a weighted \(\ell _1\)-regularizer and a quadratic loss function with a (hidden) Z-matrix. We present a linear-step parametric algorithm for two cases, depending on whether the variables have prescribed or unknown signs, and establish its strong polynomiality in both cases under suitable conditions on the weights. Such a parametric algorithm can be combined with an interval search scheme for choosing the parameter to optimize a secondary objective function in a bilevel setting. The analysis makes use of a least-element property of a Z-function and, for the case of a quadratic loss function, the strongly polynomial solvability of a linear complementarity problem with a hidden Z-matrix. The origin of the latter class of matrices can be traced to an inspirational paper of Olvi Mangasarian, to whom we dedicate our present work.
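For concreteness, the following display is a minimal sketch, in our own notation rather than a verbatim restatement of the paper's formulations, of the two problem classes just described: the folded concave sparsity problem with a quadratic loss, and its singly-parametric weighted \(\ell _1\) counterpart,
\[
\min_{x\in\mathbb{R}^n}\; \tfrac{1}{2}\,x^\top Q x - b^\top x + \gamma \sum_{i=1}^n p(|x_i|) \qquad\text{and}\qquad \min_{x\in\mathbb{R}^n}\; \tfrac{1}{2}\,x^\top Q x - b^\top x + \gamma \sum_{i=1}^n w_i\,|x_i|.
\]
Here \(Q\) is a (hidden) Z-matrix, a Z-matrix being one with nonpositive off-diagonal entries; \(\gamma > 0\) is the parameter; \(w \geq 0\) is a vector of weights; and \(p\) is a folded concave surrogate of the \(\ell _0\) function, of which the piecewise quadratic minimax concave penalty of Zhang is a standard instance. In the weighted \(\ell _1\) case the stationarity conditions can be cast as a linear complementarity problem \(0 \leq z \perp Mz + q \geq 0\); recalling the definition from the literature, \(M\) is hidden Z in Mangasarian's sense if \(MX = Y\) for some Z-matrices \(X\) and \(Y\) satisfying \(r^\top X + s^\top Y > 0\) for some nonnegative vectors \(r\) and \(s\).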


References
Adler, I., Cottle, R.W., Pang, J.S.: Some LCPs solvable in strongly polynomial time with Lemke’s algorithm. Math. Program., Ser. A 160(1), 477–493 (2016)
Ahn, M., Pang, J.S., Xin, J.: Difference-of-convex learning: directional stationarity, optimality, and sparsity. SIAM J. Optim. 27(3), 1637–1665 (2017)
Atamtürk, A., Gómez, A.: Strong formulations for quadratic optimization with M-matrices and indicator variables. Math. Program., Ser. B 170, 141–176 (2018)
Atamtürk, A., Gómez, A., Han, S.: Sparse and smooth signal estimation: convexification of L0 formulations. J. Mach. Learn. Res. 22, 1–43 (2021)
Bach, F.: Submodular functions: from discrete to continuous domains. Math. Program. 175(1), 419–459 (2019)
Barlow, R.E., Bartholomew, D.J., Bremner, J.M., Brunk, H.D.: Statistical Inference Under Order Restrictions: The Theory and Application of Isotonic Regression. Wiley, New York (1972)
Bennett, K.P., Kunapuli, G., Hu, J., Pang, J.S.: Bilevel optimization and machine learning. In: Computational Intelligence: Research Frontiers. Lecture Notes in Computer Science, vol. 5050, pp. 25–47 (2008)
Bertsimas, D., Cory-Wright, R.: A scalable algorithm for sparse portfolio selection. arXiv preprint (2018). arXiv:1811.00138
Bian, W., Chen, X.: A smoothing proximal gradient algorithm for nonsmooth convex regression with cardinality penalty. SIAM J. Numer. Anal. 58(1), 858–883 (2020)
Blumensath, T., Davies, M.E.: Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14, 629–654 (2008)
Cai, B., Zhang, A., Stephen, J.M., Wilson, T.W., Calhoun, V.D., Wang, Y.P.: Capturing dynamic connectivity from resting state FMRI using time-varying graphical lasso. IEEE Trans. Biomed. Eng. 66(7), 1852–1862 (2018)
Candès, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted \(\ell _1\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)
Chandrasekaran, R.: A special case of the complementary pivot problem. Opsearch 7, 263–268 (1970)
Chen, T.W., Wardill, T., Sun, Y., Pulver, S., Renninger, S., Baohan, A., Schreiter, E.R., Kerr, R.A., Orger, M., Jayaraman, V.: Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013)
Chen, X.: Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 134, 71–99 (2012)
Chen, Y., Ge, D., Wang, M., Wang, Z., Ye, Y., Yin, H.: Strong NP-hardness for sparse optimization with concave penalty functions. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70 (2017)
Chen, X., Ge, D., Wang, Z., Ye, Y.: Complexity of unconstrained L2-Lp minimization. Math. Program. 143, 371–383 (2014)
Chen, X., Xu, F., Ye, Y.: Lower bound theory of nonzero entries in solutions of \(\ell _2\)-\(\ell _p\) minimization. SIAM J. Sci. Comput. 32, 2832–2852 (2010)
Chen, X., Zhou, W.: Convergence of the reweighted \(\ell _1\) minimization algorithm for \(\ell _2\)-\(\ell _p\) minimization. Comput. Optim. Appl. 59, 47–61 (2014)
Cottle, R.W., Pang, J.S.: On solving linear complementarity problems as linear programs. Math. Program. Study 7, 88–107 (1978)
Cottle, R.W., Pang, J.S., Stone, R.E.: The Linear Complementarity Problem. SIAM Classics in Applied Mathematics, vol. 60, Philadelphia (2009) [Originally published by Academic Press, Boston (1992)]
Cottle, R.W., Veinott, A.F., Jr.: Polyhedral sets having a least element. Math. Program. 3, 238–249 (1972)
Cui, Y., Chang, T.H., Hong, M., Pang, J.S.: A study of piecewise-linear quadratic programs. J. Optim. Theory Appl. 186, 523–553 (2020)
Cui, Y., Pang, J.S.: Modern Nonconvex Nondifferentiable Optimization. MOS-SIAM Series on Optimization. SIAM, Philadelphia (2021)
Dong, H., Ahn, M., Pang, J.S.: Structural properties of affine sparsity constraints. Math. Program., Ser. B 176(1–2), 95–135 (2018)
Dong, H., Chen, K., Linderoth, J.: Regularization vs. relaxation: a conic optimization perspective of statistical variable selection (2015). arXiv:1510.06083
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)
Fan, J., Xue, L., Zou, H.: Strong oracle optimality of folded concave penalized estimation. Ann. Stat. 42(3), 819–849 (2014)
Fattahi, S., Gómez, A.: Scalable inference of sparsely-changing Markov random fields with strong statistical guarantees. In: Advances in Neural Information Processing Systems 34 (NeurIPS 2021). https://proceedings.neurips.cc/paper/2021/hash/33853141e0873909be88f5c3e6144cc6-Abstract.html
Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual (2021). https://www.gurobi.com
Hallac, D., Park, Y., Boyd, S., Leskovec, J.: Network inference via the time-varying graphical lasso. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 205–213 (2017)
Hastie, T., Tibshirani, R., Wainwright, M.: Statistical Learning with Sparsity: The Lasso and Generalizations. Monographs on Statistics and Applied Probability, vol. 143. CRC Press (2015)
He, Z., Han, S., Gómez, A., Cui, Y., Pang, J.S.: Comparing solution paths of sparse quadratic minimization with a Stieltjes matrix. Department of Industrial and Systems Engineering, University of Southern California (2021)
Hochbaum, D.S., Lu, Ch.: A faster algorithm for solving a generalization of isotonic median regression and a class of fused Lasso problems. SIAM J. Optim. 27(4), 2563–2596 (2017)
Jewell, S., Witten, D.: Exact spike train inference via \(\ell _0\) optimization. Ann. Appl. Stat. 12(4), 2457–2482 (2018)
Kunapuli, G., Bennett, K., Hu, J., Pang, J.S.: Classification model selection via bilevel programming. Optim. Methods Softw. 23(4), 475–489 (2008)
Kunapuli, G., Bennett, K., Hu, J., Pang, J.S.: Bilevel model selection for support vector machines. In: Hansen, P., Pardalos, P. (eds.) CRM Proceedings and Lecture Notes. American Mathematical Society, vol. 45, pp. 129–158 (2008)
Lee, Y.C., Mitchell, J.E., Pang, J.S.: Global resolution of the support vector machine regression parameters selection problem with LPCC. EURO J. Comput. Optim. 3(3), 197–261 (2015)
Lee, Y.C., Mitchell, J.E., Pang, J.S.: An algorithm for global solution to bi-parametric linear complementarity constrained linear programs. J. Glob. Optim. 62(2), 263–297 (2015)
Le Thi, H.A., Pham Dinh, T., Vo, X.T.: DC approximation approaches for sparse optimization. Eur. J. Oper. Res. 244(1), 26–46 (2015)
Liu, H., Yao, T., Li, R., Ye, Y.: Folded concave penalized sparse linear regression: sparsity, statistical performance, and algorithmic theory for local solutions. Math. Program. 166, 207–240 (2017)
Lu, Z., Zhou, Z., Sun, Z.: Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization. Math. Program. 176(1–2), 369–401 (2019)
Mairal, J., Yu, B.: Complexity analysis of the Lasso regularization path. In: Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, UK (2012)
Mangasarian, O.L.: Linear complementarity problems solvable by a single linear program. Math. Program. 10, 263–270 (1976)
Moré, J., Rheinboldt, W.C.: On P- and S-functions and related classes of nonlinear mappings. Linear Algebra Appl. 6, 45–68 (1973)
Mosek ApS. The MOSEK optimization toolbox for MATLAB manual. Version 9.3 (2019). http://docs.mosek.com/9.3/toolbox/index.html
Pan, L., Chen, X.: Group sparse optimization for images recovery using capped folded concave functions. SIAM J. Imaging Sci. 14(1), 1–25 (2021)
Pang, J.S.: On a class of least-element linear complementarity problems. Math. Program. 16, 111–126 (1979)
Pang, J.S.: Least-element complementarity theory. Ph.D. Thesis, Department of Operations Research, Stanford University (1976)
Pang, J.S., Chandrasekaran, R.: Linear complementarity problems solvable by a polynomially bounded pivoting algorithm. Math. Program. Study 25, 13–27 (1985)
Pang, J.S., Razaviyayn, M., Alvarado, A.: Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42, 95–118 (2017)
Rheinboldt, W.C.: On M-functions and their applications to nonlinear Gauss-Seidel iterations and to network flows. J. Math. Anal. Appl. 32, 274–307 (1970)
Rockafellar, R.T.: Convex Analysis. Princeton University Press (1970)
Tamir, A.: Minimality and complementarity properties associated with Z-functions and M-functions. Math. Program. 7, 17–31 (1974)
Tibshirani, R.J., Hoefling, H., Tibshirani, R.: Nearly-isotonic regression. Technometrics 53(1), 54–61 (2011)
Vogelstein, J.C., Packer, A.M., Machado, T.A., Sippy, T., Babadi, B., Paninski, L.: Fast nonnegative deconvolution for spike train inference from population calcium imaging. J. Neurophysiol. 104(6), 3691–3704 (2010)
Ye, Y.: On the complexity of approximating a KKT point of quadratic programming. Math. Program. 80, 195–211 (1998)
Zhang, C.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010)
Acknowledgements
The authors would like to thank Dr. Ying Cui at the University of Minnesota for her insightful comments that have helped to improve this manuscript, and for bringing to our attention the two references [16, 43]. They are also grateful to two referees who have provided constructive comments and offered additional references that have helped to improve the presentation and quality of the paper.
The work of the first author was based on research supported by the National Science Foundation under Grant CIF-2006762. The work of the third author was based on research supported by the U.S. Air Force Office of Scientific Research under Grant FA9550-18-1-0382. A tribute to Olvi L. Mangasarian.
Cite this article
Gómez, A., He, Z. & Pang, JS. Linear-step solvability of some folded concave and singly-parametric sparse optimization problems. Math. Program. 198, 1339–1380 (2023). https://doi.org/10.1007/s10107-021-01766-4