Abstract
Motivated by modern regression applications, in this paper we study the convexification of quadratic optimization problems with indicator variables and combinatorial constraints on the indicators. Unlike most previous work on the convexification of sparse regression problems, we simultaneously consider the nonlinear objective, indicator variables, and combinatorial constraints. We prove that, for a separable quadratic objective function, the perspective reformulation is ideal independently of the constraints of the problem. In contrast, while rank-one relaxations cannot be strengthened by exploiting information from a k-sparsity constraint for \(k\ge 2\), they can be improved for other constraints arising in inference problems with hierarchical structure or multicollinearity.
Andrés Gómez is supported, in part, by grant 1930582 of the National Science Foundation. Simge Küçükyavuz is supported, in part, by ONR grant N00014-19-1-2321.
Appendix
Proof
(Theorem 2). First, note that the validity of the new inequality defining \(\mathrm {cl \ conv}\left( Z_{Q_1}\right) \) follows from Proposition 1. For \(a, b \in \mathbb R^{p}\) and \(c \in \mathbb {R}\), consider the following two optimization problems:
and
The analysis for the cases \(c=0\) and \(c<0\) is similar to the proof of Theorem 1, so we may assume \(c=1\) and \(b \in \mathbb R^{p}\). First suppose that b is not a multiple of the all-ones vector; then \(b_i < b_j\) for some \(i,j\in [p]\) with \(i\ne j\). Let \(\bar{z} = e_i + e_j\), \(\bar{\beta }= \tau (e_i - e_j)\) for a scalar \(\tau \), and \(\bar{t}=0\). The point \((\bar{z},\bar{\beta },\bar{t})\) is feasible for both (14) and (15), and letting \(\tau \) tend to infinity drives the objective value to minus infinity. Hence both (14) and (15) are unbounded.
Now suppose that \(b = \kappa \mathbf 1\) for some \(\kappa \in \mathbb {R}\) and \(c = 1\); in this case both (14) and (15) have finite optimal values. It suffices to show that there exists an optimal solution \((z^{*}, \beta ^{*}, t^{*})\) of (15) with \(z^{*}\) integral. If \(\sum _{i \in [p]} z^{*}_i = 0\), then \(z^{*}_i =\beta _i^*= 0\) for all \(i \in [p]\) in both (14) and (15), and we are done. If \(0< \sum _{i \in [p]} z^{*}_i < 1\) and the corresponding optimal objective value is nonnegative, then setting \(z^{*} =\mathbf 0\), \(\beta ^{*} =\mathbf 0\) and \(t^{*} = 0\) yields a feasible solution with an objective value at least as good. If \(0< \sum _{i \in [p]} z^{*}_i < 1\) and \((z^{*}, \beta ^{*}, t^{*})\) attains a negative objective value, then let \(\gamma = \frac{1}{\sum _{i \in [p]} z^{*}_i} > 1\): the point \((\gamma z^{*}, \gamma \beta ^{*}, \gamma t^{*})\) is also feasible for (15) and attains a strictly smaller objective value, a contradiction.
Finally, consider the case where \(\sum _{i \in [p]} z^{*}_i \ge 1\). In this case, the constraint \((\mathbf 1^\top \beta )^2 \le t\) is active and the optimal value is attained when \(\mathbf 1^\top \beta ^{*} = -\frac{\kappa }{2}\) and \(t^{*} = (\mathbf 1^\top \beta ^{*})^2 \), and (15) has the same optimal value as the LP:
The constraint matrix of this LP is an interval matrix and hence totally unimodular, so the LP has an integral optimal solution \(z^{*}\); consequently, so does (15). \(\square \)
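The final step rests on the classical fact that interval matrices (0/1 matrices in which the ones in every row are consecutive) are totally unimodular. As a quick sanity check, the Python sketch below tests total unimodularity by brute force on a small hypothetical interval matrix (an illustration only, not the constraint matrix of the LP above): every square submatrix must have determinant in {-1, 0, 1}.

```python
import numpy as np
from itertools import combinations

# A small hypothetical interval matrix (each row has a consecutive run of
# ones); this is an illustration, not the constraint matrix from the paper.
A = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
], dtype=float)

def is_totally_unimodular(M, tol=1e-9):
    """Brute-force check: every square submatrix has determinant in {-1, 0, 1}."""
    m, n = M.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                d = np.linalg.det(M[np.ix_(rows, cols)])
                if min(abs(d), abs(abs(d) - 1)) > tol:
                    return False
    return True

print(is_totally_unimodular(A))  # True: interval matrices are totally unimodular
```

Total unimodularity is exactly what guarantees that an LP over such a constraint matrix with integral right-hand side has an integral optimal vertex.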
Proof
(Lemma 1). Suppose \(z^{*}\) is an extreme point of \(Q_g\) with a fractional entry. If \(\sum _{i \in [p-1]} z^{*}_i - (p-2)z^{*}_p > 1\), consider the two cases \(z^{*}_p = 0\) and \(z^{*}_p >0\). When \(z^{*}_p = 0\) and there exists a fractional coordinate \(z^{*}_i\) with \(i \in [p-1]\), we can perturb \(z^{*}_i\) by a sufficiently small quantity \(\epsilon \) so that both \(z^{*} + \epsilon e_i\) and \(z^{*} - \epsilon e_i\) are in \(Q_g\). Then \(z^{*} = \frac{1}{2} (z^{*} + \epsilon e_i) + \frac{1}{2} (z^{*} - \epsilon e_i)\), contradicting the fact that \(z^{*}\) is an extreme point of \(Q_g\). When \(1> z^{*}_p > 0\), we can perturb \(z^{*}_p\) and all other \(z^{*}_i\) with \(z^{*}_i = z^{*}_p\) by a sufficiently small quantity \(\epsilon \) and stay in \(Q_g\); we reach a contradiction in the same way.
Now suppose \(\sum _{i \in [p-1]} z^{*}_i - (p-2)z^{*}_p = 1\), and consider again the two cases \(z^{*}_p = 0\) and \(z^{*}_p >0\). When \(z^{*}_p = 0\), we have \(z^{*} = z^{*}_1e_1 + \cdots + z^{*}_{p-1} e_{p-1}\), which is a contradiction: since \(z^{*}\) has a fractional entry and \(\sum _{i \in [p-1]} z^{*}_i = 1\) with \(0 \le z^{*}_i \le 1\) for all i, there exist at least two indices \(i, j \in [p-1]\), \(i\ne j\), with \(1> z^{*}_i , z^{*}_j >0\), so \(z^{*}\) is a convex combination of the points \(e_i\in Q_g\), \(i\in [p-1]\). When \(1> z^{*}_p > 0\), we first show that at most one of \(z^{*}_1, z^{*}_2, \dots , z^{*}_{p-1}\) equals 1. Suppose \(z^{*}_i =1 \) and \(z^{*}_j = 1\) for \(i,j\in [p-1]\) with \(i\ne j\); then \(\sum _{l \in [p-1]} z^{*}_l - (p-2)z^{*}_p = z^{*}_i + \sum _{l \in [p-1], l \ne i} (z^{*}_{l} - z^{*}_p) \ge z^{*}_i + (z^{*}_{j} - z^{*}_p) > z^{*}_i = 1\), a contradiction. We now show that we can perturb \(z^{*}_p\) and the \(p-2\) smallest elements among \(z^{*}_i, i \in [p-1]\), by a small quantity \(\epsilon \) and remain in \(Q_g\). The equality \(\sum _{i \in [p-1]} z_i - (p-2) z_p = 1\) clearly continues to hold after the perturbation. Moreover, adding a small quantity \(\epsilon \) to \(z^{*}_p\) and to the \(p-2\) smallest elements among \(z^{*}_i, i \in [p-1]\), does not violate the hierarchy constraint, since the largest element among \(z^{*}_i, i \in [p-1]\), must be strictly greater than \(z^{*}_p\). (Note that if \(z^{*}_i = z^{*}_p\) for all \(i \in [p]\), then \(\sum _{i \in [p-1]} z^{*}_i - (p-2)z^{*}_p = z^{*}_p < 1\).) Since \(z^{*}_i \ge z^{*}_p >0\) for all \(i \in [p-1]\), subtracting a small quantity \(\epsilon \) does not violate the non-negativity constraints. Thus, \(z^{*}\) can be written as a convex combination of two points in \(Q_g\), a contradiction. \(\square \)
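The integrality asserted by Lemma 1 can be sanity-checked numerically for small p by enumerating the vertices of \(Q_g\) directly. The sketch below uses one reading of \(Q_g\) for p = 3, namely the constraint \(\sum _{i \in [p-1]} z_i - (p-2) z_p \ge 1\), the hierarchy constraints \(z_p \le z_i\), and the unit box; this encoding is our assumption here, since the formal definition of \(Q_g\) appears in the body of the paper.

```python
import numpy as np
from itertools import combinations

p = 3
# One reading of Q_g for p = 3 (an assumption; see above), in the form A z <= b:
#   z_1 + z_2 - (p-2) z_3 >= 1,  z_3 <= z_1,  z_3 <= z_2,  0 <= z <= 1
A = np.array([
    [-1, -1, p - 2],                      # -(z1 + z2) + (p-2) z3 <= -1
    [-1, 0, 1], [0, -1, 1],               # hierarchy: z3 <= z1, z3 <= z2
    [-1, 0, 0], [0, -1, 0], [0, 0, -1],   # z >= 0
    [1, 0, 0], [0, 1, 0], [0, 0, 1],      # z <= 1
], dtype=float)
b = np.array([-1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)

# A vertex is a feasible point where p linearly independent constraints are
# tight: solve every p-subset of tight constraints and keep the feasible ones.
vertices = set()
for rows in combinations(range(len(A)), p):
    M, rhs = A[list(rows)], b[list(rows)]
    if abs(np.linalg.det(M)) < 1e-9:
        continue  # these rows do not determine a unique point
    z = np.linalg.solve(M, rhs)
    if np.all(A @ z <= b + 1e-9):
        vertices.add(tuple(np.round(z, 6)))

print(sorted(vertices))
```

Under this encoding, every vertex found is a 0/1 vector, consistent with the lemma.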
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Wei, L., Gómez, A., Küçükyavuz, S. (2020). On the Convexification of Constrained Quadratic Optimization Problems with Indicator Variables. In: Bienstock, D., Zambelli, G. (eds) Integer Programming and Combinatorial Optimization. IPCO 2020. Lecture Notes in Computer Science(), vol 12125. Springer, Cham. https://doi.org/10.1007/978-3-030-45771-6_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45770-9
Online ISBN: 978-3-030-45771-6