
Approximating polyhedra with sparse inequalities

  • Full Length Paper
  • Series B
Mathematical Programming

Abstract

In this paper, we study how well one can approximate arbitrary polytopes using sparse inequalities. Our motivation comes from the use of sparse cutting-planes in mixed-integer programming (MIP) solvers, since they help in solving the linear programs encountered during branch-and-bound more efficiently. However, how well can we approximate the integer hull by using only sparse cutting-planes? To understand this question better, given a polytope \(P\) (e.g. the integer hull of a MIP), let \(P^k\) be its best approximation using cuts with at most \(k\) non-zero coefficients. We consider \(d(P, P^k) = \max _{x \in P^k} \left( \min _{y \in P} \Vert x - y\Vert \right) \) as a measure of the quality of sparse cuts. In our first result, we present general upper bounds on \(d(P, P^k)\) which depend on the number of vertices of the polytope. Our bounds imply that if \(P\) has polynomially many vertices, using half sparsity (i.e., \(k = n/2\)) already approximates it very well. Second, we present a lower bound on \(d(P, P^k)\) for random polytopes that shows that the upper bounds are quite tight. Third, we show that for a class of hard packing IPs, sparse cutting-planes do not approximate the integer hull well; that is, \(d(P, P^k)\) is large for such instances unless \(k\) is very close to \(n\). Finally, we show that using sparse cutting-planes in extended formulations is at least as good as using them in the original polyhedron, and we give an example where the former is in fact much better.


Notes

  1. If \(k \ge \frac{8\log 4tn }{9}\), then \(\frac{n^{\frac{1}{4}}\sqrt{8\sqrt{n}}\sqrt{\log 4tn} }{\sqrt{k}} \ge \frac{8\sqrt{n}\log 4tn}{3k}\).

References

  1. Achterberg, T.: Personal communication

  2. Amaldi, E., Coniglio, S., Gualandi, S.: Coordinated cutting plane generation via multi-objective separation. Math. Program. 143(1–2), 87–110 (2014). doi:10.1007/s10107-012-0596-x

  3. Andersen, K., Weismantel, R.: Zero-coefficient cuts. In: Eisenbrand, F., Shepherd F.B. (eds.) Integer Programming and Combinatorial Optimization. Lecture Notes in Computer Science, vol. 6080, pp. 57–70. Springer, Berlin, Heidelberg (2010)

  4. Balas, E., de Souza, C.C.: The vertex separator problem: a polyhedral investigation. Math. Program. 103(3), 583–608 (2005). doi:10.1007/s10107-005-0574-7

  5. Basu, A., Bonami, P., Cornuéjols, G., Margot, F.: On the relative strength of split, triangle and quadrilateral cuts. Math. Program. 126(2), 281–314 (2011)

  6. Basu, A., Cornuéjols, G., Molinaro, M.: A probabilistic analysis of the strength of the split and triangle closures. In: Günlük, O., Woeginger, G. (eds.) Integer Programming and Combinatoral Optimization. Lecture Notes in Computer Science, vol. 6655, pp. 27–38. Springer, Berlin, Heidelberg (2011)

  7. Bixby, R.E.: Solving real-world linear programs: a decade and more of progress. Oper. Res. 50(1), 3–15 (2002). doi:10.1287/opre.50.1.3.17780

  8. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)

  9. Coleman, T.F.: Large Sparse Numerical Optimization. Springer, New York, NY (1984)

  10. DasGupta, A.: Probability for Statistics and Machine Learning. Springer, Berlin (2011)

  11. David, H., Nagaraja, H.: Order Statistics. Wiley, New York (2003)

  12. Dey, S.S., Iroume, A., Molinaro, M.: Some lower bounds on sparse outer approximations of polytopes. arXiv:1412.3765

  13. Eldersveld, S., Saunders, M.: A block-LU update for large-scale linear programming. SIAM J. Matrix Anal. Appl. 13(1), 191–201 (1992). doi:10.1137/0613016

  14. Goemans, M.X.: Worst-case comparison of valid inequalities for the TSP. Math. Program. 69(2), 335–349 (1995)

  15. Gu, Z.: Personal communication

  16. Jeroslow, R.: On defining sets of vertices of the hypercube by linear inequalities. Discret. Math. 11, 119–124 (1975)

  17. Kaparis, K., Letchford, A.N.: Separation algorithms for 0–1 knapsack polytopes. Math. Program. 124(1–2), 69–91 (2010)

  18. Koltchinskii, V.: Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Springer, Berlin (2011)

  19. Matoušek, J., Vondrák, J.: The Probabilistic Method. Manuscript (2008)

  20. Narisetty, A.: Personal communication

  21. Reid, J.: A sparsity-exploiting variant of the Bartels-Golub decomposition for linear programming bases. Math. Program. 24(1), 55–69 (1982). doi:10.1007/BF01585094

  22. Ziegler, G.M.: Lectures on 0/1-polytopes. In: Polytopes Combinatorics and Computation, pp. 1–41. Springer, Berlin (2000)


Author information

Correspondence to Marco Molinaro.

Additional information

Santanu S. Dey and Qianyi Wang were partially supported by NSF Grant CMMI-1149400.

Appendices

Appendix 1: Concentration inequalities

We state Bernstein’s inequality in a slightly weaker but more convenient form.

Theorem 8

(Bernstein's inequality [18, Appendix A.2]) Let \({\varvec{X}}_1, {\varvec{X}}_2, \ldots , {\varvec{X}}_n\) be independent random variables such that \(|{\varvec{X}}_i - {\mathbb {E}}[{\varvec{X}}_i]| \le M\) for all \(i \in [n]\). Let \({\varvec{X}}= \sum _{i = 1}^n {\varvec{X}}_i\) and define \(\sigma ^2 = {\text {Var}}({\varvec{X}})\). Then for all \(t > 0\) we have

$$\begin{aligned} \Pr (|{\varvec{X}}- {\mathbb {E}}[{\varvec{X}}]| > t) \le \exp \left( -\min \left\{ \frac{t^2}{4 \sigma ^2}, \frac{3t}{4M} \right\} \right) . \end{aligned}$$
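To make the tail bound concrete, here is a minimal Monte Carlo sketch (not from the paper; the uniform distribution and all parameters are illustrative assumptions) comparing the empirical tail of a sum of bounded independent random variables with the bound of Theorem 8.

```python
# Hypothetical check of Theorem 8: X_i uniform on [-M, M], so
# |X_i - E[X_i]| <= M and Var(X_i) = M^2 / 3.
import numpy as np

rng = np.random.default_rng(0)
n, M, trials = 200, 1.0, 100_000

X = rng.uniform(-M, M, size=(trials, n))
S = X.sum(axis=1)              # realizations of X = sum_i X_i (here E[X] = 0)
sigma2 = n * M**2 / 3          # Var(X)

for t in (10.0, 20.0, 40.0):
    empirical = np.mean(np.abs(S) > t)
    bound = np.exp(-min(t**2 / (4 * sigma2), 3 * t / (4 * M)))
    print(f"t={t:5.1f}  empirical={empirical:.5f}  Bernstein bound={bound:.5f}")
```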

Appendix 2: Empirically generating a lower bound on \(d(P,P^k)\)

We estimate a lower bound on \(d(P,P^k)\) using the following procedure. The input to the procedure is the set of points \(\{p^1, \ldots , p^t \} \subseteq [0,1]^n\), which are the vertices of \(P\). For every \(I \in {[n] \atopwithdelims ()k}\), we use PORTA to obtain an inequality description of \(P+ \mathbb {R}^{\bar{I}}\). Putting all these inequalities together yields an inequality description of \(P^k\). Unfortunately, due to the large number of inequalities, we are unable to enumerate the vertices of \(P^k\) using PORTA. We therefore obtain a lower bound on \(d(P,P^k)\) via a shooting experiment.

First observe that given \(u \in \mathbb {R}^n{\setminus } \{0\}\) we obtain a lower bound on \(d(P,P^k)\) as:

$$\begin{aligned} \frac{1}{\Vert u\Vert }\left( \max \{ u^Tx : x \in P^k\} - \max \{ u^Tx : x \in P\} \right) . \end{aligned}$$

Moreover, it can be verified that some direction attains the exact value of \(d(P, P^k)\). We generated 20,000 random directions u, each picked uniformly from \([-1,1]^n\). We also found that for instances where \(p^j \in \{ x \in \{0,1 \}^n \,:\, \sum _{i = 1}^n x_i = \frac{n}{2}\}\), the directions \((\frac{1}{\sqrt{n}}, \ldots , \frac{1}{\sqrt{n}})\) and \(-(\frac{1}{\sqrt{n}}, \ldots , \frac{1}{\sqrt{n}})\) yield good lower bounds. The figure in Section 1.3(c) plots the best among the 20,002 lower bounds found as above.
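For concreteness, here is a sketch of this shooting procedure in Python, assuming inequality descriptions \((A_P, b_P)\) of \(P\) and \((A_k, b_k)\) of \(P^k\) have already been assembled (e.g. with PORTA); the function names and the use of scipy's linprog are illustrative, not the paper's actual implementation.

```python
# Shooting experiment sketch: each direction u gives the lower bound
# (max{u^T x : x in P^k} - max{u^T x : x in P}) / ||u||  on  d(P, P^k).
import numpy as np
from scipy.optimize import linprog

def support(c, A, b):
    """max{c^T x : A x <= b, 0 <= x <= 1}; linprog minimizes, so negate."""
    res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, 1)] * len(c))
    assert res.success
    return -res.fun

def shooting_lower_bound(A_P, b_P, A_k, b_k, n, num_dirs=20_000, seed=0):
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(num_dirs):
        u = rng.uniform(-1, 1, size=n)       # direction uniform in [-1,1]^n
        gap = support(u, A_k, b_k) - support(u, A_P, b_P)
        best = max(best, gap / np.linalg.norm(u))
    for s in (1.0, -1.0):                    # the two fixed unit directions
        u = s * np.ones(n) / np.sqrt(n)      # +-(1/sqrt(n), ..., 1/sqrt(n))
        best = max(best, support(u, A_k, b_k) - support(u, A_P, b_P))
    return best
```

The box constraints \(0 \le x \le 1\) are themselves 1-sparse valid inequalities, so for \(k \ge 1\) both \(P\) and \(P^k\) lie in \([0,1]^n\) and the bounds passed to linprog are harmless.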

Appendix 3: Anticoncentration of linear combination of Bernoulli’s

It is convenient to restate Lemma 3 in terms of Rademacher random variables (i.e., random variables taking values \(-1\) and \(1\) with equal probability).

Lemma 14

(Lemma 3, restated) Let \({\varvec{X}}_1, {\varvec{X}}_2, \ldots , {\varvec{X}}_n\) be independent Rademacher random variables. Then for every \(a \in [-1,1]^n\),

$$\begin{aligned}\Pr \left( a{\varvec{X}}\!\ge \! \frac{\alpha }{\sqrt{n}} \left( 1 - \frac{1}{n^2}\right) \Vert a\Vert _1 - \frac{1}{n^2} \right) \!\ge \! \left( e^{-50 \alpha ^2} - e^{-100 \alpha ^2}\right) ^{60 \log n}, \quad \alpha \!\in \! \left[ 0, \frac{\sqrt{n}}{8} \right] .\end{aligned}$$

We start with the case where the vector a has all of its coordinates being similar.

Lemma 15

Let \({\varvec{X}}_1, {\varvec{X}}_2, \ldots , {\varvec{X}}_n\) be independent Rademacher random variables. For every \(\epsilon \ge 1/20\) and \(a \in [1 - \epsilon , 1]^n\),

$$\begin{aligned}\Pr \left( a{\varvec{X}}\ge \frac{\alpha }{\sqrt{n}} \Vert a\Vert _1\right) \ge e^{-50 \alpha ^2} - e^{-\frac{\alpha ^2}{4 \epsilon ^2}}, \quad \alpha \in \left[ 0, \frac{\sqrt{n}}{8} \right] .\end{aligned}$$

Proof

Since \(a{\varvec{X}}= \sum _{i=1}^n {\varvec{X}}_i - \sum _{i=1}^n (1-a_i) {\varvec{X}}_i\), having \(\sum _{i=1}^n {\varvec{X}}_i \ge 2t\) and \(\sum _{i=1}^n (1-a_i) {\varvec{X}}_i \le t\) implies that \(a{\varvec{X}}\ge t\). Therefore,

$$\begin{aligned} \Pr (a{\varvec{X}}\ge t)&\ge \Pr \left( \left( \sum _{i=1}^n {\varvec{X}}_i \ge 2t\right) \wedge \left( \sum _{i=1}^n (1 -a_i) {\varvec{X}}_i \le t\right) \right) \nonumber \\&\ge \Pr \left( \sum _{i=1}^n {\varvec{X}}_i \ge 2t\right) - \Pr \left( \sum _{i=1}^n (1 -a_i) {\varvec{X}}_i > t\right) , \end{aligned}$$
(3)

where the second inequality uses the union bound, via \(\Pr (A \wedge B) \ge \Pr (A) - \Pr (\lnot B)\). For \(t \in [0, n/8]\), the first term on the right-hand side can be lower bounded by \(e^{-\frac{50 t^2}{n}}\) (see for instance Section 7.3 of [19]). The second term on the right-hand side can be bounded using Bernstein's inequality: since \(|(1-a_i) {\varvec{X}}_i| \le \epsilon \) and \({\text {Var}}(\sum _{i=1}^n (1 -a_i) {\varvec{X}}_i) = \sum _{i=1}^n (1 - a_i)^2 \le n \epsilon ^2\), we get that for all \(t \in [0,n/8]\)

$$\begin{aligned} \Pr \left( \sum _{i=1}^n (1 -a_i) {\varvec{X}}_i > t\right) \le \exp \left( - \min \left\{ \frac{t^2}{4 n \epsilon ^2}, \frac{3t}{4 \epsilon } \right\} \right) = e^{-\frac{t^2}{4n\epsilon ^2}}, \end{aligned}$$

where the equality holds because \(t \le n/8 \le 3n\epsilon \) when \(\epsilon \ge 1/20\). The lemma then follows by plugging these bounds into (3) and setting \(t = \alpha \sqrt{n} \ge \frac{\alpha }{\sqrt{n}}\Vert a\Vert _1\). \(\square \)

Proof of Lemma 14

Without loss of generality assume \(a > 0\), since flipping the sign of negative coordinates of a changes neither the distribution of \(a{\varvec{X}}\) nor the term \(\frac{\alpha }{\sqrt{n}} \left( 1 - \frac{1}{n^2}\right) \Vert a\Vert _1 - \frac{1}{n^2}\). Also assume without loss of generality that \(\Vert a\Vert _\infty = 1\). The idea of the proof is to bucket the coordinates so that within each bucket the values of a are within a factor of \((1 \pm \epsilon )\) of each other, and then to apply Lemma 15 in each bucket.

The first step is to trim the coefficients of a that are very small. Define the trimmed version b of a by setting \(b_i = a_i\) for all i where \(a_i \ge 1/n^3\) and \(b_i = 0\) for all other i. We first show that

$$\begin{aligned} \Pr \left( b{\varvec{X}}\ge \frac{\alpha }{\sqrt{n}} \Vert b\Vert _1\right) \ge \left( e^{-50 \alpha ^2} - e^{-100 \alpha ^2}\right) ^{60 \log n}, \end{aligned}$$
(4)

and then we argue that the error introduced by considering b instead of a is small.

Set \(\epsilon = 1/20\) and \(\gamma = \frac{\alpha }{\sqrt{n}}\). For \(j \in \{0, 1, \ldots , \frac{3 \log n}{\epsilon }\}\), define the jth bucket as \(I_j = \{i : b_i \in ((1-\epsilon )^{j+1}, (1-\epsilon )^j]\}\). Since \((1-\epsilon )^{\frac{3 \log n}{\epsilon }} \le e^{-3 \log n} = 1/n^3\), every index i with \(b_i > 0\) lies in some bucket.

Now fix some bucket j, and let \(E_j\) be the event that \(\sum _{i \in I_j} b_i {\varvec{X}}_i \ge \gamma \sum _{i \in I_j} b_i\); note that this event is invariant under rescaling of \(b|_{I_j}\). Employing Lemma 15 over the vector \((1-\epsilon )^{-j}\, b|_{I_j}\), whose entries lie in \((1-\epsilon , 1]\), gives

$$\begin{aligned}\Pr \left( \sum _{i \in I_j} b_i {\varvec{X}}_i \ge \gamma \sum _{i \in I_j} b_i \right) \ge e^{-50 \gamma ^2 |I_j|} - e^{-\frac{\gamma ^2 |I_j|}{4 \epsilon ^2}} \ge e^{-50 \gamma ^2 n} - e^{-\frac{\gamma ^2 n}{4 \epsilon ^2}} , \quad \gamma \in \left[ 0, \frac{1}{8} \right] .\end{aligned}$$

Now notice that if \(E_j\) holds for every j, then \(b{\varvec{X}}\ge \gamma \Vert b\Vert _1\). Using the fact that the \(E_j\)'s are independent (due to the independence of the coordinates of \({\varvec{X}}\)), we have

$$\begin{aligned} \Pr (b{\varvec{X}}\ge \gamma \Vert b\Vert _1) \ge \Pr \left( \bigwedge _j E_j \right) = \prod _j \Pr (E_j) \ge \left( e^{-50 \gamma ^2 n} - e^{-\frac{\gamma ^2 n}{4 \epsilon ^2}}\right) ^{\frac{3 \log n}{\epsilon }} , \quad \gamma \in \left[ 0, \frac{1}{8} \right] . \end{aligned}$$

Now we claim that whenever \(b{\varvec{X}}\ge \gamma \Vert b\Vert _1\), we have \(a{\varvec{X}}\ge \frac{\alpha }{\sqrt{n}} \left( 1 - \frac{1}{n^2} \right) \Vert a\Vert _1 - \frac{1}{n^2}\). First notice that \(\Vert b\Vert _1 \ge \Vert a\Vert _1 - 1/n^2 \ge \Vert a\Vert _1 (1 - 1/n^2)\), since trimming removes at most n coordinates, each of value less than \(1/n^3\), and \(\Vert a\Vert _1 \ge \Vert a\Vert _\infty = 1\). Moreover, with probability 1 we have \(a{\varvec{X}}\ge b{\varvec{X}}- 1/n^2\), again because \(\Vert a - b\Vert _1 \le n \cdot 1/n^3 = 1/n^2\). Therefore, whenever \(b{\varvec{X}}\ge \gamma \Vert b\Vert _1\):

$$\begin{aligned}a{\varvec{X}}\ge b{\varvec{X}}- \frac{1}{n^2} \ge \gamma \Vert b\Vert _1 - \frac{1}{n^2} \ge \gamma \left( 1 - \frac{1}{n^2}\right) \Vert a\Vert _1 - \frac{1}{n^2} = \frac{\alpha }{\sqrt{n}} \left( 1 - \frac{1}{n^2}\right) \Vert a\Vert _1 - \frac{1}{n^2}.\end{aligned}$$

This concludes the proof of the lemma. \(\square \)
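As a sanity check (not part of the paper), one can estimate the probability in Lemma 14 by simulation and compare it with the stated bound; the parameters below are arbitrary, and the lemma's lower bound is very loose (it only needs to be positive for its application), so it typically underflows to zero in floating point.

```python
# Estimate Pr( aX >= (alpha/sqrt(n)) (1 - 1/n^2) ||a||_1 - 1/n^2 )
# for Rademacher X and an arbitrary a in [-1,1]^n (illustrative setup).
import numpy as np

rng = np.random.default_rng(1)
n, trials, alpha = 100, 50_000, 0.5

a = rng.uniform(-1, 1, size=n)                     # any a in [-1,1]^n
X = rng.integers(0, 2, size=(trials, n)) * 2 - 1   # Rademacher +-1 entries
threshold = (alpha / np.sqrt(n)) * (1 - 1 / n**2) * np.abs(a).sum() - 1 / n**2

empirical = np.mean(X @ a >= threshold)
bound = (np.exp(-50 * alpha**2) - np.exp(-100 * alpha**2)) ** (60 * np.log(n))
print(f"empirical={empirical:.4f}  lemma lower bound={bound:.2e}")
```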

Appendix 4: Hard packing integer programs

1.1 Proof of Lemma 8

Fix \(i \in [n]\). We have \({\mathbb {E}}[\sum _{j=1}^m {\varvec{A}}^j_i] = \frac{mM}{2}\) and \({\text {Var}}(\sum _{j=1}^m {\varvec{A}}^j_i) \le \frac{mM^2}{4}\). Employing Bernstein’s inequality we get

$$\begin{aligned} \Pr \left( \sum _{j=1}^m {\varvec{A}}^j_i \!<\! \frac{mM}{2} - \sqrt{m \log 8n} M\right) \le \exp \left( -\min \left\{ \log 8n, \frac{3 \sqrt{m \log 8n}}{4} \right\} \right) \!\le \! \frac{1}{8n}, \end{aligned}$$

where the last inequality uses the assumption that \(m \ge 8 \log 8n\). Similarly, we get that

$$\begin{aligned}&\Pr \left( \sum _{i,j} {\varvec{A}}^j_i > \frac{nmM}{2} + \sqrt{n m \log 8n} M\right) \\ {}&\quad \le \exp \left( -\min \left\{ \log 8n, \frac{3 \sqrt{n m \log 8n}}{4} \right\} \right) \le \frac{1}{8n}. \end{aligned}$$

Taking a union bound over the first displayed inequality for all \(i \in [n]\), together with the last inequality, we get that with probability at least \(3/4\) the valid cut \(\sum _{i} \left( \frac{2}{mM} \sum _j {\varvec{A}}^j_i\right) x_i \le \frac{1}{mM} \sum _{i,j} {\varvec{A}}^j_i\) (obtained by aggregating all inequalities in the formulation) has every left-hand-side coefficient at least \(1 - \frac{2 \sqrt{\log 8n}}{\sqrt{m}}\) and right-hand side at most \(\frac{n}{2} + \frac{\sqrt{n \log 8n}}{\sqrt{m}}\). This concludes the proof.
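As an illustration (an assumed setup: entries \({\varvec{A}}^j_i\) drawn uniformly from \([0, M]\), which matches the mean \(M/2\) and variance at most \(M^2/4\) used above, though the paper's instances may use a different distribution), one can generate a random instance and check the aggregated cut against these bounds.

```python
# Generate one random packing matrix and verify the aggregated cut's
# coefficient and right-hand-side bounds from the proof of Lemma 8.
import numpy as np

rng = np.random.default_rng(2)
n, M = 50, 10.0
m = int(np.ceil(8 * np.log(8 * n)))          # the assumption m >= 8 log 8n
A = rng.uniform(0, M, size=(m, n))

coeffs = (2.0 / (m * M)) * A.sum(axis=0)     # left-hand-side coefficients
rhs = A.sum() / (m * M)                      # aggregated right-hand side

print("min coefficient:", coeffs.min(),
      " lower bound:", 1 - 2 * np.sqrt(np.log(8 * n) / m))
print("rhs:", rhs, " upper bound:", n / 2 + np.sqrt(n * np.log(8 * n) / m))
```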

1.2 Proof of Lemma 9

Fix \(j \in [m]\). We have \({\mathbb {E}}[\sum _{i = 1}^n {\varvec{A}}^j_i] = \frac{n M}{2}\) and \({\text {Var}}( \sum _{i = 1}^n {\varvec{A}}^j_i) \le n M^2/4\) and hence by Bernstein’s inequality we get

$$\begin{aligned} \Pr \left( \sum _{i = 1}^n {\varvec{A}}^j_i > \frac{n M}{2} + M \sqrt{n \log 8 m}\right) \!\le \! \exp \left( - \min \left\{ \log 8m, \frac{3 \sqrt{n \log 8 m}}{4} \right\} \right) \!\le \! \frac{1}{8m}, \end{aligned}$$

where the last inequality uses the assumption that \(m \le n\). The lemma then follows by taking a union bound over all \(j \in [m]\).


Cite this article

Dey, S.S., Molinaro, M. & Wang, Q. Approximating polyhedra with sparse inequalities. Math. Program. 154, 329–352 (2015). https://doi.org/10.1007/s10107-015-0925-y
