
A Multiplicative Weight Updates Algorithm for Packing and Covering Semi-infinite Linear Programs

Algorithmica

Abstract

We consider the following semi-infinite linear programming problems: \(\max \) (resp., \(\min \)) \(c^Tx\) s.t. \(y^TA_ix+(d^i)^Tx \le b_i\) (resp., \(y^TA_ix+(d^i)^Tx \ge b_i\)), for all \(y \in {{\mathcal {Y}}}_i\), for \(i=1,\ldots ,N\), where \({{\mathcal {Y}}}_i\subseteq {\mathbb {R}}^m_+\) are given compact convex sets and \(A_i\in {\mathbb {R}}^{m_i\times n}_+\), \(b=(b_1,\ldots ,b_N)\in {\mathbb {R}}_+^N\), \(d^i\in {\mathbb {R}}_+^n\), and \(c\in {\mathbb {R}}_+^n\) are given non-negative matrices and vectors. This general framework is useful in modeling many interesting problems. For example, it can be used to represent a sub-class of robust optimization in which the coefficients of the constraints are drawn from convex uncertainty sets \({{\mathcal {Y}}}_i\), and the goal is to optimize the objective function for the worst-case choice in each \({{\mathcal {Y}}}_i\). When the uncertainty sets \({{\mathcal {Y}}}_i\) are ellipsoids, we obtain a sub-class of second-order cone programming. We show how to extend the multiplicative weights update method to derive approximation schemes for the above packing and covering problems. When the sets \({{\mathcal {Y}}}_i\) are simple, such as ellipsoids or boxes, this yields substantial improvements in running time over general convex programming solvers. We also consider the mixed packing/covering problem, in which both packing and covering constraints are given, and the objective is to find an approximately feasible solution.


Notes

  1. This is already part of the problem definition, but we repeat it here for ease of reference in the rest of the paper.

  2. \({\tilde{O}}(\cdot )\) suppresses polylogarithmic factors that depend on m, N, and \(\frac{1}{\epsilon }\).

  3. Note that the definition of these oracles comes naturally from the corresponding algorithms for packing/covering LPs; whether weaker oracles suffice is an interesting open question.

  4. A typical example is the so-called Gaussian kernel, where \(q_{ij}=e^{-\Vert z^i-z^j\Vert ^2/(2\sigma ^2)}\).

  5. That is, \(\log p\) is concave.

  6. Throughout “\(\log \)” denotes the natural logarithm.

  7. \({\tilde{O}}(\cdot )\) suppresses polylogarithmic factors that depend on m, N, and \(\frac{1}{\epsilon }\).

  8. We assume here the natural logarithm.

  9. In fact, in the packing algorithm, \(y^TA_ix(t)\) is bounded from above by \(T/(1-\epsilon _3)\) by the property of the oracle MinVec and the fact that \(M(t)<T\), while in the covering algorithm, it is bounded by T since the integral is taken over the active subset \({{\mathcal {Y}}}_i(t)\).

References

  1. Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. 95(1), 3–51 (2003)

  2. Allen-Zhu, Z., Lee, Y.T., Orecchia, L.: Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1824–1831 (2016). http://dl.acm.org/citation.cfm?id=2884435.2884562

  3. Allen-Zhu, Z., Orecchia, L.: Nearly-linear time positive LP solver with faster convergence rate. In: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing (STOC), pp. 229–236 (2015). https://doi.org/10.1145/2746539.2746573

  4. Arora, S., Hazan, E., Kale, S.: Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In: Proceedings of the 46th Symposium on Foundations of Computer Science (FOCS), pp. 339–348 (2005)

  5. Arora, S., Kale, S.: A combinatorial, primal–dual approach to semidefinite programs. In: Proceedings of the 39th Symposium on Theory of Computing (STOC), pp. 227–236 (2007)

  6. Bartal, Y., Byers, J., Raz, D.: Fast, distributed approximation algorithms for positive linear programming with applications to flow control. SIAM J. Comput. 33(6), 1261–1279 (2004)

  7. Beck, A., Teboulle, M.: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3), 167–175 (2003). https://doi.org/10.1016/S0167-6377(02)00231-6

  8. Ben-Tal, A., Hazan, E., Koren, T., Mannor, S.: Oracle-based robust optimization via online learning. Oper. Res. 63(3), 628–638 (2015)

  9. Ben-Tal, A., Nemirovski, A.: Robust optimization: methodology and applications. Math. Program. 92(3), 453–480 (2002)

  10. Bertsimas, D., Brown, D.B., Caramanis, C.: Theory and applications of robust optimization. SIAM Rev. 53(3), 464–501 (2011)

  11. Bertsimas, D., Thiele, A.: A robust optimization approach to supply chain management. In: Integer Programming and Combinatorial Optimization, 10th International IPCO Conference, New York, NY, USA, 7–11 June 2004, Proceedings, pp. 86–100 (2004)

  12. Bhalgat, A., Gollapudi, S., Munagala, K.: Optimal auctions via the multiplicative weight method. In: Proceedings of the Fourteenth ACM Conference on Electronic Commerce, EC ’13, pp. 73–90. ACM, New York (2013)

  13. Blum, A.: On-line algorithms in machine learning. In: Developments from a June 1996 Seminar on Online Algorithms: The State of the Art, pp. 306–325. Springer, London (1998)

  14. Brönnimann, H., Goodrich, M.T.: Almost optimal set covers in finite VC-dimension. Discrete Comput. Geom. 14(4), 463–479 (1995)

  15. Brown, G.W.: Iterative solution of games by fictitious play. Activity Anal. Prod. Allocation 13(1), 374–376 (1951)

  16. Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3–4), 231–357 (2015). https://doi.org/10.1561/2200000050

  17. Chau, C.K., Elbassioni, K., Khonji, M.: Truthful mechanisms for combinatorial ac electric power allocation. In: Proceedings of the 2014 International Conference on Autonomous Agents and Multi-agent Systems, pp. 1005–1012. International Foundation for Autonomous Agents and Multiagent Systems (2014)

  18. Chazelle, B.: The Discrepancy Method: Randomness and Complexity. Cambridge University Press, New York (2000)

  19. Constantine, C., Mannor, S., Xu, H.: Robust Optimization in Machine Learning, pp. 369–402. MIT Press, Cambridge (2012)

  20. Daskalakis, C., Deckelbaum, A., Kim, A.: Near-optimal no-regret algorithms for zero-sum games. In: Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’11, pp. 235–254. SIAM (2011)

  21. Diedrich, F., Jansen, K.: Faster and simpler approximation algorithms for mixed packing and covering problems. Theor. Comput. Sci. 377(1–3), 182–204 (2007)

  22. Elbassioni, K., Makino, K., Mehlhorn, K., Ramezani, F.: On randomized fictitious play for approximating saddle points over convex sets. Algorithmica 73(2), 441–459 (2015). https://doi.org/10.1007/s00453-014-9902-8

  23. Elbassioni, K., Nguyen, T.T.: Approximation schemes for binary quadratic programming problems with low CP-rank decompositions. arXiv preprint arXiv:1411.5050 (2014)

  24. Freund, Y., Schapire, R.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29(1–2), 79–103 (1999)

  25. Garg, N., Khandekar, R.: Fractional covering with upper bounds on the variables: solving LPs with negative entries. In: Proceedings of the 14th European Symposium on Algorithms (ESA), pp. 371–382 (2004)

  26. Garg, N., Könemann, J.: Faster and simpler algorithms for multicommodity flow and other fractional packing problems. In: Proceedings of the 39th Symposium on Foundations of Computer Science (FOCS), pp. 300–309 (1998)

  27. Goldfarb, D., Iyengar, G.: Robust portfolio selection problems. Math. Oper. Res. 28(1), 1–38 (2003)

  28. Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control. Lecture Notes in Control and Information Sciences, pp. 95–110. Springer, Berlin (2008)

  29. Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming, version 2.1 (2014)

  30. Grigoriadis, M., Khachiyan, L.: Approximate solution of matrix games in parallel. In: Advances in Optimization and Parallel Computing, pp. 129–136 (1992)

  31. Grigoriadis, M., Khachiyan, L.: A sublinear-time randomized approximation algorithm for matrix games. Oper. Res. Lett. 18(2), 53–58 (1995)

  32. Grigoriadis, M., Khachiyan, L.: Coordination complexity of parallel price-directive de-composition. Math. Oper. Res. 21(2), 321–340 (1996)

  33. Grigoriadis, M.D., Khachiyan, L.G., Porkolab, L., Villavicencio, J.: Approximate max–min resource sharing for structured concave optimization. SIAM J. Optim. 11(4), 1081–1091 (2001)

  34. Hazan, E.: Efficient algorithms for online convex optimization and their application. Ph.D. thesis, Princeton University, USA (2006)

  35. Helmbold, D.P., Schapire, R.E., Singer, Y., Warmuth, M.K.: On-line portfolio selection using multiplicative updates. In: Machine Learning, Proceedings of the Thirteenth International Conference (ICML ’96), Bari, Italy, 3–6 July 1996, pp. 243–251 (1996)

  36. Jain, R., Yao, P.: A parallel approximation algorithm for positive semidefinite programming. In: IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, 22–25 October 2011, pp. 463–471 (2011)

  37. Kale, S.: Efficient algorithms using the multiplicative weights update method. Ph.D. thesis, Princeton University, USA (2007)

  38. Khandekar, R.: Lagrangian relaxation based algorithms for convex programming problems. Ph.D. thesis, Indian Institute of Technology, Delhi (2004)

  39. Koufogiannakis, C., Young, N.: Beating simplex for fractional packing and covering linear programs. In: Proceedings of the 48th Symposium on Foundations of Computer Science (FOCS), pp. 494–504 (2007)

  40. Littlestone, N., Warmuth, M.: The weighted majority algorithm. Inf. Comput. 108(2), 212–261 (1994)

  41. Lobo, M.S., Vandenberghe, L., Boyd, S., Lebret, H.: Applications of second-order cone programming. Linear Algebra Appl. 284(1–3), 193–228 (1998)

  42. López, M., Still, G.: Semi-infinite programming. Eur. J. Oper. Res. 180(2), 491–518 (2007)

  43. Lorenz, R., Boyd, S.: Robust minimum variance beamforming. IEEE Trans. Signal Process. 53(5), 1684–1696 (2005)

  44. Lovász, L., Vempala, S.: Fast algorithms for logconcave functions: sampling, rounding, integration and optimization. In: Proceedings of the 47th Symposium on Foundations of Computer Science (FOCS), pp. 57–68 (2006)

  45. Luby, M., Nisan, N.: A parallel approximation algorithm for positive linear programming. In: Proceedings of the 25th Symposium on Theory of Computing (STOC), pp. 448–457 (1993)

  46. Magaril-Il’yaev, G.G., Tikhomirov, V.M.: Convex Analysis: Theory and Applications, vol. 222. American Mathematical Society, Providence (2003)

  47. Nemirovskii, A., Yudin, D.: Problem Complexity and Method Efficiency in Optimization. A Wiley-Interscience Publication. Wiley, London (1983)

  48. Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)

  49. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course, vol. 87. Springer, Berlin (2013)

  50. Peng, R., Tangwongsan, K.: Faster and simpler width-independent parallel algorithms for positive semidefinite programming. In: Proceedings of the Twenty-Fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’12, pp. 101–108. ACM, New York (2012)

  51. Plotkin, S., Shmoys, D., Tardos, É.: Fast approximation algorithms for fractional packing and covering problems. In: Proceedings of the 32nd Symposium on Foundations of Computer Science (FOCS), pp. 495–504 (1991)

  52. Robinson, J.: An iterative method of solving a game. Ann. Math. 54(2), 296–301 (1951)

  53. Shivaswamy, P.K., Bhattacharyya, C., Smola, A.J.: Second order cone programming approaches for handling missing and uncertain data. J. Mach. Learn. Res. 7, 1283–1314 (2006)

  54. Tsang, I.W., Kwok, J.T.: Efficient hyperkernel learning using second-order cone programming. IEEE Trans. Neural Netw. 17(1), 48–58 (2006)

  55. Tulabandhula, T., Rudin, C.: Robust optimization using machine learning for uncertainty sets. In: International Symposium on Artificial Intelligence and Mathematics, ISAIM 2014, Fort Lauderdale, FL, USA, 6–8 January 2014 (2014)

  56. Wang, D., Rao, S., Mahoney, M.W.: Unified acceleration method for packing and covering problems via diameter reduction. arXiv preprint arXiv:1508.02439 (2015)

  57. Young, N.: Sequential and parallel algorithms for mixed packing and covering. In: Proceedings of the 42nd Symposium on Foundations of Computer Science (FOCS), pp. 538–546 (2001)

  58. Zass, R., Shashua, A.: A unifying approach to hard and probabilistic clustering. In: Tenth IEEE International Conference on Computer Vision, 2005. ICCV 2005, vol. 1, pp. 294–301. IEEE (2005)

  59. Zhu, Z.A., Orecchia, L.: A novel, simple interpretation of Nesterov’s accelerated method as a combination of gradient and mirror descent. CoRR arXiv:1407.1537 (2014)


Author information

Correspondence to Waleed Najy.


Appendices

Efficient Implementations of the Oracle Integral

In this section we discuss how to efficiently implement the oracle Integral\((p,f,\epsilon ,\sigma ,{{\mathcal {Y}}})\) for a given compact convex set \({{\mathcal {Y}}}\), a log-concave function \(p:{{\mathcal {Y}}}\rightarrow {\mathbb {R}}_+\), a function \(f:{{\mathcal {Y}}}\rightarrow {\mathbb {R}}^k\) which is either \(f(y):=1\) or \(f(y):=y\), and \(\epsilon ,\sigma \in [0,1)\).

Write \(F(y):=f(y)p(y)\) for \(y\in {{\mathcal {Y}}}\). Then \(F_i(y)\) is log-concave for all \(i\in [k]\). Thus, the question is how to efficiently approximate \(\int _{{{\mathcal {Y}}}}F_i(y)dy\).

1.1 General Convex Sets

Lovász and Vempala [44] show how to implement the integration oracle in \({\tilde{O}}(m^4)\) time. More precisely, they prove the following result.

Theorem 4

[44] Let \(F:Y\rightarrow {\mathbb {R}}\) be a log-concave function over a compact convex set \(Y\subseteq {\mathbb {R}}^m\). Assume also the availability of a point \(\tilde{y}\) that (approximately) maximizes F over Y. Given \(\epsilon ,\sigma >0\), a number A can be computed such that with probability \(1-\sigma \),

$$\begin{aligned} (1-\epsilon )\int _{Y} F(y)dy \le A \le (1+\epsilon )\int _{Y} F(y)dy \end{aligned}$$
(33)

in \(O\left( \displaystyle \frac{m^4}{\epsilon ^2}\log ^7\frac{m}{\epsilon \sigma }\right) \) oracle calls, each involving either a membership oracle call for Y or an evaluation of F.

1.2 Reduction to Volume Computation

We consider here the particular case that we actually need in our algorithms, where \(F_i(y)\) is either \((1\pm \epsilon )^{a^Ty}\) or \(y_i(1\pm \epsilon )^{a^Ty}\), for some \(a\in {\mathbb {R}}_+^m\), and where \(a^Ty\le T/m\) for all \(y \in {{\mathcal {Y}}}\subseteq {\mathbb {R}}_+^m\) (see footnote 9). The standard idea is to slice \({{\mathcal {Y}}}\) into slabs orthogonal to a and to approximate the values of the function \(a^Ty\) by its values at the boundaries of the slabs.

For simplicity, let us consider the case when \(F_i(y)=F(y)=(1+\epsilon )^{a^Ty}\), and assume that we can compute the maximum and minimum of a linear function over \({{\mathcal {Y}}}\) exactly. Let \(y_{\min }\in {\text {argmin}}_{y\in {{\mathcal {Y}}}}a^Ty\), \(y_{\max }\in {\text {argmax}}_{y\in {{\mathcal {Y}}}}a^Ty\), and define the set of points \(y_{\min }=y^0,y^1,\ldots ,y^r=y_{\max }\) in \({{\mathcal {Y}}}\), such that \(a^Ty^i=a^Ty^{i-1}+\eta \) for \(i=1,\ldots ,r\), where \(\eta :=\frac{\log (1+\epsilon _1)}{\log (1+\epsilon )}\) and \(r:=\lceil \frac{a^Ty_{\max }-a^Ty_{\min }}{\eta }\rceil \le \lceil \frac{T}{m\eta }\rceil \). Note that \((1+\epsilon )^{a^Ty^i}=(1+\epsilon _1)(1+\epsilon )^{a^Ty^{i-1}}\). Note also that, once we have \(y_{\min }\) and \(y_{\max }\), we can easily compute \(y^i=y_{\min }+\lambda _i(y_{\max }-y_{\min })\), where \(\lambda _i:=\frac{\eta \cdot i}{a^Ty_{\max }-a^Ty_{\min }}\), for \(i=1,\ldots ,r\).
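
For concreteness (with illustrative values, not taken from the paper): if \(\epsilon =0.1\) and \(\epsilon _1=0.01\), then \(\eta =\frac{\log (1.01)}{\log (1.1)}\approx 0.1044\); so if, say, \(a^Ty_{\max }-a^Ty_{\min }=5\), the construction uses \(r=\lceil 5/0.1044\rceil =48\) discretization steps, each multiplying the weight \((1+\epsilon )^{a^Ty^i}\) by exactly \(1+\epsilon _1\).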

For \(i=1,\ldots ,r\), define the slice \(Y(i):=\{y\in {{\mathcal {Y}}}:~a^Ty^{i-1}\le a^Ty\le a^Ty^i\}\). Then it is easy to see that

$$\begin{aligned} A:=(1+\epsilon )^{a^Ty_{\min }}\sum _{i=1}^r(1+\epsilon _1)^{i}{\text {vol}}(Y(i)) \end{aligned}$$
(34)

satisfies (33). Indeed,

$$\begin{aligned} \frac{A}{1+\epsilon _1}&=\sum _{i=1}^r(1+\epsilon )^{a^Ty_{\min }+\eta \cdot (i-1)}{\text {vol}}(Y(i))\le \sum _{i=1}^r\int _{Y(i)}(1+\epsilon )^{a^Ty}dy=\int _Y(1+\epsilon )^{a^Ty}dy\\&\le \sum _{i=1}^r(1+\epsilon )^{a^Ty_{\min }+\eta \cdot i}{\text {vol}}(Y(i))=A. \end{aligned}$$

If \(f(y)=y\), then \({\text {vol}}(Y(i))\) in (34) should be replaced by \(\int _{Y(i)}y\,dy\).
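
To make the reduction concrete, here is a minimal sketch (not from the paper) that instantiates (34) for the unit box \(Y=[0,1]^m\), estimating each slice volume \({\text {vol}}(Y(i))\) by plain Monte Carlo sampling; the choice of box, direction a, and all parameter values are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Approximate I = int_Y (1+eps)^{a^T y} dy over the unit box Y = [0,1]^m
# via the slicing sum (34), with slice volumes estimated by Monte Carlo.
m, eps, eps1 = 3, 0.25, 0.01           # illustrative values (assumptions)
a = rng.random(m)                      # non-negative direction vector

samples = rng.random((200_000, m))     # uniform in Y; vol(Y) = 1
s = samples @ a                        # values of a^T y at the samples

smin, smax = 0.0, a.sum()              # min/max of a^T y over the box
eta = np.log1p(eps1) / np.log1p(eps)   # slice width, so that (1+eps)^eta = 1+eps1
r = int(np.ceil((smax - smin) / eta))  # number of slices

# A = (1+eps)^{a^T y_min} * sum_{i=1}^r (1+eps1)^i * vol(Y(i)), as in (34)
A = 0.0
for i in range(1, r + 1):
    lo, hi = smin + (i - 1) * eta, smin + i * eta
    vol_i = np.mean((lo <= s) & (s < hi))   # Monte Carlo slice volume
    A += (1 + eps1) ** i * vol_i
A *= (1 + eps) ** smin

I_direct = np.mean((1 + eps) ** s)          # direct Monte Carlo reference
print(f"slicing estimate A = {A:.5f}, direct estimate = {I_direct:.5f}")
```

By the sandwich \(A/(1+\epsilon _1)\le \int _Y(1+\epsilon )^{a^Ty}dy\le A\) established above, the two printed values should agree up to a \((1+\epsilon _1)\) factor plus sampling noise.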

1.3 Euclidean Balls

We can further show that, for the case of Euclidean balls (and similarly for ellipsoids), one can reduce the integral computation to numerical integration over a single variable.

For \(y^0\in {\mathbb {R}}^m\) and \(\rho \in {\mathbb {R}}_+\), let \({\mathbf {B}}_m(y^0,\rho ):=\{y\in {\mathbb {R}}^m:~\Vert y-y^0\Vert \le \rho \}\) be the ball of radius \(\rho \) centered at \(y^0\). Denote by \({\text {vol}}_m({\mathbf {B}}_m(y^0,\rho ))\) the volume of the ball \({\mathbf {B}}_m(y^0,\rho )\) in \({\mathbb {R}}^m\).

Lemma 24

Let \(Y:={\mathbf {B}}_m(y^0,\rho )\), let \(a\in {\mathbb {R}}^m\) be a given vector, and let \(F(y):=y(1+\epsilon )^{a^Ty}\) for \(y\in {\mathbb {R}}^m\). Let \({\hat{y}}\in {\text {argmin}}\{a^Ty:~y\in Y\}\). Then

$$\begin{aligned} \int _YF(y)dy= (1+\epsilon )^{a^T{\hat{y}}}\displaystyle \int _{0}^{2\rho } (1+\epsilon )^{\Vert a\Vert h}v_{m-1} (h) ({\hat{y}}+h\frac{a}{\Vert a\Vert })dh, \end{aligned}$$
(35)

where \(v_{m-1}(h)\) is the volume of an \((m-1)\)-dimensional ball of radius \(\sqrt{h(2\rho -h)}\).

Proof

To simplify matters, we use the change of variable \(z=Uy\), where \(U\in {\mathbb {R}}^{m\times m}\) is a rotation transformation (i.e., \(U^TU=I\)) such that \(Ua = \Vert a\Vert \varvec{1}_m\). Hence, \(dy = \prod _i dy_i = \det U \prod _i dz_i = dz\). Note that this transformation maps the ball \({\mathbf {B}}_m(y^0,\rho )\) to the ball \(Z:={\mathbf {B}}_m(Uy^0,\rho )\), that is, \(UY=Z\).

For a given \(h\in [0,2\rho ]\), let \({\mathbf {B}}(h):={\mathbf {B}}_{m-1}({\hat{z}}+h\varvec{1}_m,\sqrt{h(2\rho -h)})\) be the lower-dimensional ball centered at \({\hat{z}}+h\varvec{1}_m\), where \({\hat{z}}:=U{\hat{y}}\), with radius \(\sqrt{\rho ^2-(\rho -h)^2}=\sqrt{h(2\rho -h)}\); note that \(U \left( {\hat{y}}+ha/\Vert a\Vert \right) = {\hat{z}}+h\varvec{1}_m\). For any vector \(z \in {\mathbb {R}}^m\), we denote by \(z_{{\overline{m}}}\) the vector of its first \(m-1\) components and by \(z_m\) its mth component. Then

$$\begin{aligned} \int _YF(y)dy&= \displaystyle U^T\int _Z z(1+\epsilon )^{a^TU^Tz}dz\\&=U^T\displaystyle \int _{{\hat{z}}_m}^{{\hat{z}}_m+2\rho } \int _{B(z_m-{\hat{z}}_m)} z(1+\epsilon )^{||a||\varvec{1}_m^Tz}dz_{\overline{m}}dz_m\\&= U^T\displaystyle \int _{0}^{2\rho } \int _{{\mathbf {B}}(h)} \left[ \begin{array}{c}z_{{\overline{m}}}\\ \hat{z}_m+h\end{array}\right] (1+\epsilon )^{\Vert a\Vert ({\hat{z}}_m+h)}dz_{\overline{m}}dh\\&= (1+\epsilon )^{\Vert a\Vert {\hat{z}}_m}U^T\displaystyle \int _{0}^{2\rho } (1+\epsilon )^{||a||h}\int _{{\mathbf {B}}(h)}\left[ \begin{array}{c}z_{\overline{m}}\\ {\hat{z}}_m+h\end{array}\right] dz_{{\overline{m}}}dh\\&= (1+\epsilon )^{\Vert a\Vert {\hat{z}}_m}U^T\displaystyle \int _{0}^{2\rho } (1+\epsilon )^{||a||h}{\text {vol}}_{m-1} ({\mathbf {B}}(h)) ({\hat{z}}+h\varvec{1}_m )dh\\&=(1+\epsilon )^{a^T{\hat{y}}}\displaystyle \int _{0}^{2\rho } (1+\epsilon )^{\Vert a\Vert h}{\text {vol}}_{m-1} ({\mathbf {B}}(h)) ({\hat{y}}+h\frac{a}{\Vert a\Vert })dh. \end{aligned}$$

The equality before the last uses the fact that, for \(i=1,\ldots ,m-1\), \(\int _{{\mathbf {B}}(h)} z_i\,dz_{\overline{m}}={\hat{z}}_i{\text {vol}}_{m-1} ({\mathbf {B}}(h))\) by symmetry, while \(\int _{{\mathbf {B}}(h)} ({\hat{z}}_m+h)dz_{{\overline{m}}}=({\hat{z}}_m+h){\text {vol}}_{m-1} ({\mathbf {B}}(h))\). \(\square \)

Corollary 5

Suppose that there is an algorithm that approximates the integral \(\int _{h_1}^{h_2}\tau (h)dh\) to within an additive error \(\epsilon \) in time \(q(\tau _{\max },\frac{1}{\epsilon })\), where \(\tau _{\max }:=\max _{h\in [h_1,h_2]}\log \tau (h)\). Then there is an algorithm that approximates the integral in (35) to within an additive error of \(\epsilon \) in time \(q(O((m+\frac{T}{m}+H)\log m),\frac{1}{\epsilon })\), where H is the maximum number of bits needed to represent any of the components of a, \(\rho \), and \(y^0\).

Proof

The function \(\tau (h)\) inside the integral on the R.H.S. of (35) can be written as

$$\begin{aligned} \tau (h):=\frac{\pi ^{\frac{m-1}{2}}}{\varGamma (\frac{m-1}{2}+1)}(1+\epsilon )^{\Vert a\Vert h}\left( h(2\rho -h)\right) ^{\frac{m-1}{2}} ({\hat{y}}+h\frac{a}{\Vert a\Vert }). \end{aligned}$$

where \(\varGamma \) is Euler’s gamma function, and \(\Vert a\Vert h\le T/m\). It can be easily verified that \(\tau _{\max }=O((m+\frac{T}{m}+H)\log m)\). \(\square \)
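
For illustration, the following is a minimal sketch (ours, under stated assumptions: concrete values of m, a, \(y^0\), \(\rho \), \(\epsilon \), a composite trapezoid rule as the one-dimensional integrator, and Monte Carlo as the reference) that evaluates the right-hand side of (35) and checks it against a direct estimate of \(\int _YF(y)dy\).

```python
import numpy as np
from math import gamma, pi

rng = np.random.default_rng(1)

# Evaluate the RHS of (35) for F(y) = y (1+eps)^{a^T y} over Y = B_m(y0, rho)
# by a composite trapezoid rule, and compare with a Monte Carlo estimate.
m, eps, rho = 3, 0.2, 0.8              # illustrative values (assumptions)
a = rng.random(m) + 0.1
y0 = rng.random(m)
na = np.linalg.norm(a)
u = a / na
yhat = y0 - rho * u                    # minimizer of a^T y over the ball

def v(h):  # volume of an (m-1)-ball of radius sqrt(h(2*rho - h))
    return pi ** ((m - 1) / 2) / gamma((m + 1) / 2) \
        * (h * (2 * rho - h)) ** ((m - 1) / 2)

h = np.linspace(0.0, 2 * rho, 20_000)
vals = ((1 + eps) ** (na * h) * v(h))[:, None] * (yhat[None, :] + h[:, None] * u[None, :])
dh = h[1] - h[0]
I_quad = (1 + eps) ** (a @ yhat) * dh * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)

# Monte Carlo reference: uniform samples in the ball.
n = 400_000
g = rng.standard_normal((n, m))
pts = y0 + rho * rng.random(n)[:, None] ** (1 / m) * g / np.linalg.norm(g, axis=1, keepdims=True)
vol_ball = pi ** (m / 2) / gamma(m / 2 + 1) * rho ** m
I_mc = vol_ball * np.mean(pts * (1 + eps) ** (pts @ a)[:, None], axis=0)

print("quadrature:", np.round(I_quad, 4))
print("MonteCarlo:", np.round(I_mc, 4))
```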

Proofs Omitted from Section 5.1

Lemma 25

\(\varPhi (t+1) \le \displaystyle \varPhi (t) \exp \Big ( -{\epsilon }\delta (t)\sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} \frac{p_i(y,t)}{\varPhi (t)}\, g_i(y)^T\mathbf {1}_{j(t)}dy \Big )\).

Proof

First, we note that

$$\begin{aligned} \varPhi (t+1)&= \sum _{i\in I(t+1)} \int _{{{\mathcal {Y}}}_i(t+1)} p_i(y,t+1)dy \\&= \sum _{i\in I(t+1)} \int _{{{\mathcal {Y}}}_i(t+1)} (1-\epsilon )^{g_i(y)^Tx(t+1)}dy \\&= \sum _{i\in I(t+1)} \int _{{{\mathcal {Y}}}_i(t+1)} (1-\epsilon )^{g_i(y)^T\left[ x(t)+\delta (t)\mathbf {1}_{j(t)}\right] }dy \\&= \sum _{i\in I(t+1)} \int _{{{\mathcal {Y}}}_i(t+1)} p_i(y,t)(1-\epsilon )^{\delta (t)g_i(y)^T\mathbf {1}_{j(t)}}dy \\&\le \sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} p_i(y,t)(1-\epsilon )^{\delta (t)g_i(y)^T\mathbf {1}_{j(t)}}dy. \end{aligned}$$

The last inequality is due to the fact that updates of x(t) are non-negative, and hence, \(g_i(y)^Tx(t)\le g_i(y)^Tx(t)+\delta (t) g_i(y)^T\varvec{1}_{j(t)}=g_i(y)^Tx(t+1)\), implying that \({{\mathcal {Y}}}_i(t+1) \subseteq {{\mathcal {Y}}}_i(t)\) and \(I(t+1)\subseteq I(t)\).

By the definition of the oracle MaxVec, the exponent of \((1-\epsilon )\) satisfies

$$\begin{aligned} \delta (t)g_i(y)^T\mathbf {1}_{j(t)} \le \frac{g_i(y)^T\mathbf {1}_{j(t)}}{\max _i \max _{z \in {{\mathcal {Y}}}_i(t)} g_i(z)^T\mathbf {1}_{j(t)}} \le 1. \end{aligned}$$
(36)

Recalling that for \(z \in [0,1]\) and for \(a\in [0,1]\), \((1-a)^z \le 1-az\), we have

$$\begin{aligned}&\sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} p_i(y,t)(1-\epsilon )^{\delta (t)g_i(y)^T\mathbf {1}_{j(t)}}dy\nonumber \\&\quad \le \sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} p_i(y,t)\Big (1-\epsilon \delta (t)g_i(y)^T\mathbf {1}_{j(t)}\Big )dy\nonumber \\&\quad = \varPhi (t)\left( 1-\epsilon \delta (t)\sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} \frac{p_i(y,t)}{\varPhi (t)} g_i(y)^T\mathbf {1}_{j(t)}dy\right) . \end{aligned}$$
(37)

Finally, using \(1-z \le e^{-z}\) for all z, we get:

$$\begin{aligned} \varPhi (t+1)&\le \varPhi (t) \exp \left( -{\epsilon }\delta (t)\sum _{i\in I(t)} \int _{{{\mathcal {Y}}}_i(t)} \frac{p_i(y,t)}{\varPhi (t)} g_i(y)^T\mathbf {1}_{j(t)}dy\right) . \end{aligned}$$
(38)

\(\square \)

Define \(1-{\bar{\epsilon }}:=\frac{(1-\epsilon _2)(1-\epsilon _1)}{1+\epsilon _1}\).

Lemma 26

Let \(\kappa (t):=\sum _{t'=0}^{t-1} \delta (t')\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\mathbf {1}_{j(t')}dy\). Then with probability at least \(1-2N\sigma t\), \(\kappa (t) \ge \frac{(1-{\bar{\epsilon }})\varvec{1}^Tx(t)}{z^*}\) for all t.

Proof

Let \(x^*\) be an optimal solution for (Covering). Then by the feasibility of \(x^*\) for (Covering), \(g_i(y)^Tx^* \ge 1\) for all \(y \in {{\mathcal {Y}}}_i\) and for all \(i\in [N]\). Multiplying both sides by \(p_i(y,t')\), integrating over the respective active sets, and summing over all \(i\in I(t')\), we get

$$\begin{aligned} \sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} p_i(y,t') g_i(y)^Tx^*dy \ge \sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')} p_i(y,t')dy = \varPhi (t'), \qquad \forall t'. \end{aligned}$$

Therefore,

$$\begin{aligned} \sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^Tx^*dy&\ge 1, \qquad \forall t'. \end{aligned}$$
(39)

For \({i\in I(t')}\), let \({\bar{y}}^i(t')\) and \({\bar{\phi }}^i(t')\) be the outputs of the oracles \({\textsc {Integral}}(p_i(y,t'),f(y):=y,\epsilon _1,\sigma ,{{\mathcal {Y}}}_i(t'))\) and \({\textsc {Integral}}(p_i(y,t'), f(y):=1,\epsilon _1,\sigma , {{\mathcal {Y}}}_i(t'))\) in steps 15 and 16 of iteration \(t'\), respectively. Then we have by the union bound,

$$\begin{aligned}&\Pr \left[ ~\left| \bar{y}^i(t')-\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')ydy\right| \le \epsilon _1\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')ydy, \text { for all }i \in I(t')\right] \ge 1-N\sigma ,\\&\Pr \left[ ~\left| \bar{\phi }^i(t')-\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')dy\right| \le \epsilon _1\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')dy, \text { for all }i \in I(t')\right] \ge 1-N\sigma , \end{aligned}$$

which imply

$$\begin{aligned} \Pr \Big [\left| \sum _{i\in I(t')} (A_i^T{\bar{y}}^i(t')+\bar{\phi }^i(t')d^i)^T\varvec{1}_j-\sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_jdy\right| \\ \le \epsilon _1\sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_jdy, \text { for all }j\in [n]\Big ] \ge 1-2N\sigma . \end{aligned}$$

These together with the setting of \(j(t')\) in step 18 give that with probability at least \( 1-2N\sigma \),

$$\begin{aligned}&\sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_{j(t')}dy\\&\quad \ge \frac{1}{1+\epsilon _1}\sum _{i\in I(t')}(A_i^T{\bar{y}}^i(t')+\bar{\phi }^i(t')d^i)^T\varvec{1}_{j(t')}\\&\quad \ge \frac{1-\epsilon _2}{1+\epsilon _1}\max _j\sum _{i\in I(t')}(A_i^T{\bar{y}}^i(t')+{\bar{\phi }}^i(t')d^i)^T\varvec{1}_j\\&\quad \ge \frac{(1-\epsilon _2)(1-\epsilon _1)}{1+\epsilon _1}\max _j\sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_jdy. \end{aligned}$$

It follows that with probability at least \( 1-2N\sigma \),

$$\begin{aligned}&\sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_{j(t')}dy\nonumber \\&\quad \ge (1-{\bar{\epsilon }}) \sum _{i\in I(t')}\int _{{{\mathcal {Y}}}_i(t')}p_i(y,t')g_i(y)^T\varvec{1}_jdy,\quad \text {for all }j\in [n]. \end{aligned}$$
(40)

Multiplying both sides of (40) by \(x_j^*/\varPhi (t')\) (a non-negative quantity), summing over all \(j\in [n]\), and recalling that \(z^* = \varvec{1}^Tx^* = \sum _j x_j^*\), we get that with probability at least \( 1-2N\sigma \),

$$\begin{aligned}&{z^*\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\varvec{1}_{j(t')}}dy\nonumber \\&\quad \ge (1-{\bar{\epsilon }}){\sum _j x_j^*\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\varvec{1}_{j}}dy \nonumber \\&\quad = (1-{\bar{\epsilon }}){\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T {\left( \sum _j x_j^*\varvec{1}_{j}\right) }}dy \nonumber \\&\quad = (1-{\bar{\epsilon }}){\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^Tx^*}dy\nonumber \\&\quad \ge (1-{\bar{\epsilon }}), \end{aligned}$$
(41)

where the last inequality follows from (39). Using this result, and applying the union bound over the iterations \(t'=0,\ldots ,t-1\), we can conclude that with probability at least \(1-2N\sigma t\),

$$\begin{aligned} z^* \kappa (t)&= z^* \sum _{t'=0}^{t-1} \delta (t'){\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\mathbf {1}_{j(t')}} dy \\&= \sum _{t'=0}^{t-1} \delta (t') \left( {z^*\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')} g_i(y)^T\mathbf {1}_{j(t')}}dy\right) \\&\ge (1-{\bar{\epsilon }})\sum _{t'=0}^{t-1} \delta (t')\\&=(1-{\bar{\epsilon }})\mathbf {1}^Tx(t). \end{aligned}$$

Here, the last equality is a result of the update formula for \(x(t+1)\) in step 23 of Algorithm 4. \(\square \)
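
To make the objects in Lemmas 25 and 26 concrete, the following is a minimal finite-dimensional sketch of a multiplicative-weights covering scheme, with weights \((1-\epsilon )^{(Gx)_i}\) standing in for the densities \(p_i(y,t)\) and frozen rows standing in for the active sets \({{\mathcal {Y}}}_i(t)\). It is an illustration under stated assumptions (a plain covering LP with a finite non-negative matrix G, and the threshold \(T=\log (N)/\epsilon ^2\)), not the paper's Algorithm 4, which additionally integrates over the sets \({{\mathcal {Y}}}_i\) via the Integral oracle.

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite-dimensional analogue: covering LP  min 1^T x  s.t.  Gx >= 1, x >= 0,
# with non-negative G.  Weights (1-eps)^{(Gx)_i} stand in for the densities
# p_i(y,t); rows are frozen once their load reaches T, mirroring the active
# sets Y_i(t).  All parameter choices here are assumptions for illustration.
N, n, eps = 40, 15, 0.1
G = rng.random((N, n)) + 0.05
T = np.log(N) / eps ** 2                # threshold (a standard choice)

x = np.zeros(n)
load = np.zeros(N)                      # load_i = (Gx)_i
active = np.ones(N, dtype=bool)
while active.any():
    p = (1 - eps) ** load[active]       # multiplicative weights of active rows
    j = int(np.argmax(p @ G[active]))   # column with largest weighted coverage
    delta = 1.0 / G[active, j].max()    # step so that active loads grow by <= 1
    x[j] += delta
    load += delta * G[:, j]
    active = load < T                   # freeze rows whose load reached T

x_scaled = x / T                        # scaling back gives a feasible point
print("min_i (G x)_i =", (G @ x_scaled).min())   # >= 1 up to roundoff
print("objective 1^T x =", x_scaled.sum())
```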

Lemma 27

For all t, with probability at least \( 1-2N\sigma t\), it holds that

$$\begin{aligned} \varPhi (t) \le \varPhi (0) \exp \left( -\displaystyle \frac{\epsilon (1-{\bar{\epsilon }})\mathbf {1}^Tx(t)}{z^*}\right) . \end{aligned}$$
(42)

Proof

By repeated application of (38), together with Lemma 26, we can bound \(\varPhi (t)\) as follows: with probability at least \(1-2N\sigma t\),

$$\begin{aligned} \varPhi (t)&\le \varPhi (0) \exp \left( -\epsilon \sum _{t'=0}^{t-1}\delta (t')\sum _{i\in I(t')} \int _{{{\mathcal {Y}}}_i(t')} \frac{p_i(y,t')}{\varPhi (t')}\, g_i(y)^T\mathbf {1}_{j(t')}dy\right) \\&= \varPhi (0)\, e^{-\epsilon \kappa (t)} \le \varPhi (0) \exp \left( -\frac{\epsilon (1-{\bar{\epsilon }})\mathbf {1}^Tx(t)}{z^*}\right) . \end{aligned}$$

\(\square \)

Lemma 28

Suppose \(\varPhi (t) \le \gamma \) for some \(\gamma >0\) and some iteration t of the algorithm. Assume also that there is \(v_i>0\) such that \({\text {vol}}({{\mathcal {Y}}}_i(t))\ge v_i\) for all \(i\in I(t)\). In addition, suppose that for given \(T>0\) and \(\epsilon _3 >0\) there exists an \(\alpha \) such that

$$\begin{aligned}&0<\alpha <\alpha _0 \end{aligned}$$
(43)
$$\begin{aligned}&(1-\epsilon )^{-\alpha /2}\min _{i\in I(t)}\left\{ \left( \frac{\alpha }{\alpha _0}\right) ^{m_i}v_i\right\} >1, \end{aligned}$$
(44)

where \(\alpha _0:=2T\). Then \((1-\epsilon )^{g_i(y)^Tx(t)} \le \gamma (1-\epsilon )^{-\alpha }\) for all \(y \in {{\mathcal {Y}}}_i(t)\), for all \(i \in I(t)\).

Proof

Towards a contradiction, suppose that there exist \(i\in I(t)\) and \(y^* \in {{\mathcal {Y}}}_i(t)\) such that \((1-\epsilon )^{g_i({y^*})^Tx(t)} > \gamma (1-\epsilon )^{-\alpha }\) in iteration t. Then define \(\mu ^* = \frac{\alpha }{\alpha _0}\in (0,1)\). Also, define the following sets:

$$\begin{aligned}&{{\mathcal {Y}}}_i^+(t) := \left\{ y \in {{\mathcal {Y}}}_i(t): (y-y^*)^TA_ix(t) \le \frac{\alpha }{2}\right\} ,\nonumber \\&{{\mathcal {Y}}}_i^{++}(t) :=\left( 1-\frac{1}{\mu ^*}\right) y^*+\frac{1}{\mu ^*}{{\mathcal {Y}}}_i^+(t) =\left\{ y^*+\frac{1}{\mu ^*}(y-y^*) : y \in {{\mathcal {Y}}}_i^+(t) \right\} . \end{aligned}$$
(45)


Claim 7

\(y^*+\mu '(y-y^*) \in {{\mathcal {Y}}}_i^{++}(t)\) for all \(\mu ' \in [0,\frac{1}{\mu ^*}]\) and \(y \in {{\mathcal {Y}}}_i^+(t)\). In particular, \({{\mathcal {Y}}}_i^+(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\).

Proof

\(y^*\) is in \({{\mathcal {Y}}}_i^+(t)\) since \((y^*-y^*)^TA_ix(t)=0\le \alpha /2\). \({{\mathcal {Y}}}_i^+(t)\) is the intersection of a half-space and \({{\mathcal {Y}}}_i(t)\), which are both convex. Thus, \({{\mathcal {Y}}}_i^+(t)\) is convex. Therefore, for all \(\mu ' \in [0,\frac{1}{\mu ^*}]\) and for any \(y \in {{\mathcal {Y}}}_i^+(t)\), we have

$$\begin{aligned}&\mu '\mu ^*y+(1-\mu '\mu ^*)y^* \in {{\mathcal {Y}}}_i^+(t), \\&\therefore y^*+\mu '\mu ^*(y-y^*) \in {{\mathcal {Y}}}_i^+(t). \end{aligned}$$

Consider the point \(y^*+\mu '\mu ^*(y-y^*)\). Since this point is in \({{\mathcal {Y}}}_i^+(t)\), we can substitute it into the definition of \({{\mathcal {Y}}}_i^{++}(t)\) to get that

$$\begin{aligned}&y^*+\frac{1}{\mu ^*}\big (y^*+\mu '\mu ^*(y-y^*)-y^*\big ) \in {{\mathcal {Y}}}_i^{++}(t), \\&\therefore y^*+\mu '(y-y^*) \in {{\mathcal {Y}}}_i^{++}(t). \end{aligned}$$

In particular for \(\mu '=1\), we have \(y\in {{\mathcal {Y}}}_i^{++}(t)\), implying the claim. \(\square \)

Claim 8

\({\text {vol}}({{\mathcal {Y}}}_i^{++}(t)) = \displaystyle \left( \frac{1}{\mu ^*}\right) ^{m_i} {\text {vol}}({{\mathcal {Y}}}_i^+(t))\).

Proof

Immediate from the definition in (45). \(\square \)

Claim 9

\({{\mathcal {Y}}}_i(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\).

Proof

Suppose for a contradiction that there exists a point \(y \in {{\mathcal {Y}}}_i(t)\backslash {{\mathcal {Y}}}_i^{++}(t)\). Then by Claim 7, y is also outside \({{\mathcal {Y}}}_i^+(t)\). Now define:

$$\begin{aligned} \mu ^+ = \max \left\{ \mu :y^*+\mu (y-y^*) \in {{\mathcal {Y}}}_i^+(t)\right\} . \end{aligned}$$

(Note that the maximum exists since \({{\mathcal {Y}}}_i^+(t)\) is closed and bounded.) Define also \(y^+ = y^* + \mu ^+(y-y^*)\). By the definition of \(\mu ^+\), \(y^+\) is on the boundary of \({{\mathcal {Y}}}_i^+(t)\), and the segment joining \(y\not \in {{\mathcal {Y}}}_i^+(t)\) and \(y^*\in {{\mathcal {Y}}}_i^+(t)\) crosses the hyperplane \(\{z\in {\mathbb {R}}^{m_i}: (z-y^*)^TA_ix(t)=\frac{\alpha }{2}\}\) at \(z=y^+\). This implies that \((y^+-y^*)^TA_ix(t) = \frac{\alpha }{2}\). Therefore,

$$\begin{aligned} {y^*}^TA_ix(t) + \alpha /2&= {y^+}^TA_ix(t)\\&=(\mu ^+y+(1-\mu ^+)y^*)^TA_ix(t).\\ \therefore \mu ^+{y^*}^TA_ix(t) +\alpha /2&= \mu ^+y^TA_ix(t). \end{aligned}$$

Thus,

$$\begin{aligned} \alpha /2&= \mu ^+(y-y^*)^TA_ix(t)\nonumber \\&\le \mu ^+\left( {y}^TA_ix(t)+(d^i)^Tx(t)\right) \nonumber \\&\le \mu ^*\left( {y}^TA_ix(t)+(d^i)^Tx(t)\right) \nonumber \\&\le \mu ^*T, \end{aligned}$$
(46)

where the first inequality is due to the non-negativity of \(y^{*T}A_ix(t)\) (by (A1)) and \((d^i)^Tx(t)\), the second is because \(\mu ^+ < \mu ^*\) (by Claim 7, as \(y^*+\frac{1}{\mu ^+}(y^+-y^*)\not \in {{\mathcal {Y}}}_i^{++}(t)\)), and the third is because \(y \in {{\mathcal {Y}}}_i(t)\) and thus \(g_i(y)^Tx(t) \le T\) by the definition of \({{\mathcal {Y}}}_i(t)\). But then plugging \(\mu ^*=\frac{\alpha }{\alpha _0}\) in (46) gives \(\alpha _0<2T\), contradicting the definition of \(\alpha _0\). Thus, no points \(y \in {{\mathcal {Y}}}_i(t) \backslash {{\mathcal {Y}}}_i^{++}(t)\) exist, and \({{\mathcal {Y}}}_i(t) \subseteq {{\mathcal {Y}}}_i^{++}(t)\). \(\square \)

Recalling that \(y^TA_ix(t) \le {y^*}^TA_ix(t)+\alpha /2\), and hence, \(g_i(y)^Tx(t)\le g_i(y^*)^Tx(t)+\alpha /2\), for all y in \({{\mathcal {Y}}}_i^+(t)\), we have

$$\begin{aligned} \varPhi _i(t)&= \int _{{{\mathcal {Y}}}_i(t)} (1-\epsilon )^{g_i(y)^Tx(t)}dy \ge \int _{{{\mathcal {Y}}}_i^+(t)} (1-\epsilon )^{g_i(y)^Tx(t)}dy\\&\ge \int _{{{\mathcal {Y}}}_i^+(t)} (1-\epsilon )^{g_i({y^*})^Tx(t)+\alpha /2}dy = (1-\epsilon )^{g_i({y^*})^Tx(t)+\alpha /2}{\text {vol}}({{\mathcal {Y}}}_i^+(t)). \end{aligned}$$

From the claims above, the volumes of the sets \({{\mathcal {Y}}}_i(t)\), \({{\mathcal {Y}}}_i^+(t)\) and \({{\mathcal {Y}}}_i^{++}(t)\) are related by \({\text {vol}}({{\mathcal {Y}}}_i^+(t)) = \left( \mu ^*\right) ^{m_i}{\text {vol}}({{\mathcal {Y}}}_i^{++}(t)) \ge \left( {\mu ^*}\right) ^{m_i}{\text {vol}}({{\mathcal {Y}}}_i(t))\ge \left( {\mu ^*}\right) ^{m_i} v_i\). Therefore,

$$\begin{aligned} \varPhi _i(t)&\ge (1-\epsilon )^{g_i({y^*})^Tx(t)+\alpha /2}\left( {\mu ^*}\right) ^{m_i}v_i. \end{aligned}$$
(47)

The point \(y^*\) is one such that \((1-\epsilon )^{g_i({y^*})^Tx(t)} > \gamma (1-\epsilon )^{-\alpha }\). Also, \(\alpha \) was chosen such that \((1-\epsilon )^{-\alpha /2}\left( \frac{\alpha }{\alpha _0}\right) ^{m_i}v_i>1\). Using these two facts in (47), we obtain

$$\begin{aligned} \varPhi (t) \ge \varPhi _i(t)&> \gamma (1-\epsilon )^{-\alpha +\alpha /2}(\mu ^*)^{m_i}{\text {vol}}({{\mathcal {Y}}}_i(t))\\&\ge \gamma (1-\epsilon )^{-\alpha /2}\left( \frac{\alpha }{\alpha _0}\right) ^{m_i}v_i > \gamma . \end{aligned}$$

This contradicts the hypothesis of the lemma. \(\square \)


Cite this article

Elbassioni, K., Makino, K. & Najy, W. A Multiplicative Weight Updates Algorithm for Packing and Covering Semi-infinite Linear Programs. Algorithmica 81, 2377–2429 (2019). https://doi.org/10.1007/s00453-018-00539-4
