Abstract
We study chance-constrained problems in which the constraints involve the probability of a rare event. We discuss the relevance of such problems and show that the existing sampling-based algorithms cannot be applied directly in this case, since they require an impractical number of samples to yield reasonable solutions. We argue that importance sampling (IS) techniques, combined with a Sample Average Approximation (SAA) approach, can be effectively used in such situations, provided that variance can be reduced uniformly with respect to the decision variables. We give sufficient conditions to obtain such uniform variance reduction, and prove asymptotic convergence of the combined SAA-IS approach. As it often happens with IS techniques, the practical performance of the proposed approach relies on exploiting the structure of the problem under study; in our case, we work with a telecommunications problem with Bernoulli input distributions, and show how variance can be reduced uniformly over a suitable approximation of the feasibility set by choosing proper parameters for the IS distributions. Although some of the results are specific to this problem, we are able to draw general insights that can be useful for other classes of problems. We present numerical results to illustrate our findings.
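The core idea can be illustrated with a small, self-contained Python sketch. This is ours, not the paper's: the toy rare event \(\{Z>w\}\) with independent Bernoulli inputs, the twisted parameters q, and all function names are hypothetical choices made purely for illustration. A crude Monte Carlo estimator wastes almost all samples when the event is rare, whereas an importance-sampling estimator that samples from twisted Bernoulli distributions and reweights by the likelihood ratio concentrates samples on the event.

```python
import random

def crude_mc(p, w, n_samples, rng):
    # Crude Monte Carlo estimate of P{Z > w}, where Z is a sum of
    # independent Bernoulli(p_i) variables; most samples miss the event.
    hits = 0
    for _ in range(n_samples):
        z = sum(1 for pi in p if rng.random() < pi)
        if z > w:
            hits += 1
    return hits / n_samples

def is_mc(p, q, w, n_samples, rng):
    # Importance sampling: draw from twisted Bernoulli(q_i) distributions
    # instead, and weight each sample in the event by the likelihood ratio.
    total = 0.0
    for _ in range(n_samples):
        xi = [1 if rng.random() < qi else 0 for qi in q]
        if sum(xi) > w:
            lr = 1.0
            for b, pi, qi in zip(xi, p, q):
                lr *= (pi / qi) if b else ((1.0 - pi) / (1.0 - qi))
            total += lr
    return total / n_samples
```

With \(p_i=0.1\), five inputs and \(w=4\), the target probability is \(10^{-5}\); with 20,000 samples the crude estimator typically sees zero hits, while the IS estimator with \(q_i=0.9\) already gives several correct digits.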



Notes
Random lower semicontinuous functions are called normal integrands in [36].
References
Adas, A.: Traffic models in broadband networks. IEEE Commun. Mag. 35(7), 82–89 (1997)
Andrieu, L., Henrion, R., Römisch, W.: A model for dynamic chance constraints in hydro power reservoir management. Eur. J. Oper. Res. 207(2), 579–589 (2010)
Artstein, Z., Wets, R.J.B.: Consistency of minimizers and the SLLN for stochastic programs. J. Convex Anal. 2(1–2), 1–17 (1996)
Asmussen, S., Glynn, P.: Stochastic Simulation. Springer, New York (2007)
Beraldi, P., Ruszczyński, A.: The probabilistic set-covering problem. Oper. Res. 50(6), 956–967 (2002)
Bonami, P., Lejeune, M.: An exact solution approach for portfolio optimization problems under stochastic and integer constraints. Oper. Res. 57(3), 650–670 (2009)
Calafiore, G., Campi, M.C.: Uncertain convex programs: randomized solutions and confidence levels. Math. Program. 102(1), 25–46 (2005)
Campi, M.C., Garatti, S.: The exact feasibility of randomized solutions of uncertain convex programs. SIAM J. Optim. 19(3), 1211–1230 (2008)
Campi, M.C., Garatti, S.: A sampling-and-discarding approach to chance-constrained optimization: feasibility and optimality. J. Optim. Theory Appl. 148(2), 257–280 (2011)
Campi, M.C., Garatti, S., Prandini, M.: The scenario approach for systems and control design. Ann. Rev. Control 33(2), 149–157 (2009)
Carniato, A., Camponogara, E.: Integrated coal-mining operations planning: modeling and case study. Int. J. Coal Prep. Util. 31(6), 299–334 (2011)
Charnes, A., Cooper, W.W., Symonds, G.H.: Cost horizons and certainty equivalents: an approach to stochastic programming of heating oil. Manag. Sci. 4, 235–263 (1958)
Chung, K.L.: A Course in Probability Theory, 2nd edn. Academic Press, New York (1974)
Dantzig, G.B., Glynn, P.W.: Parallel processors for planning under uncertainty. Ann. Oper. Res. 22(1), 1–21 (1990)
Dentcheva, D., Prékopa, A., Ruszczyński, A.: Concavity and efficient points of discrete distributions in probabilistic programming. Math. Program. 89(1), 55–77 (2000)
Dorfleitner, G., Utz, S.: Safety first portfolio choice based on financial and sustainability returns. Eur. J. Oper. Res. 221(1), 155–164 (2012)
Duckett, W.: Risk analysis and the acceptable probability of failure. Struct. Eng. 83(15), 25–26 (2005)
Ermoliev, Y.M., Ermolieva, T.Y., MacDonald, G., Norkin, V.: Stochastic optimization of insurance portfolios for managing exposure to catastrophic risks. Ann. Oper. Res. 99(1–4), 207–225 (2000)
Henrion, R., Römisch, W.: Metric regularity and quantitative stability in stochastic programs with probabilistic constraints. Math. Program. 84(1), 55–88 (1999)
Homem-de-Mello, T., Bayraksan, G.: Monte Carlo methods for stochastic optimization. Surv. Oper. Res. Manag. Sci. 19(1), 56–85 (2014)
Infanger, G.: Monte Carlo (importance) sampling within a Benders decomposition algorithm for stochastic linear programs. Ann. Oper. Res. 39(1), 69–95 (1992)
Jiang, R., Guan, Y.: Data-driven chance constrained stochastic program (2012). http://www.optimization-online.org
Kahn, H., Harris, T.: Estimation of particle transmission by random sampling. Nat. Bur. Stand. Appl. Math. Ser. 12, 27–30 (1951)
L’Ecuyer, P., Mandjes, M., Tuffin, B.: Importance sampling in rare event simulation. In: Rubino, G., Tuffin, B. (eds.) Rare Event Simulation using Monte Carlo Methods, Chap. 2. Wiley, New York (2009)
Lejeune, M.: Pattern definition of the p-efficiency concept. Ann. Oper. Res. 200(1), 23–36 (2012)
Li, W.L., Zhang, Y., So, A.C., Win, Z.: Slow adaptive OFDMA systems through chance constrained programming. IEEE Trans. Signal Process. 58(7), 3858–3869 (2010)
Liu, Y., Guo, H., Zhou, F., Qin, X., Huang, K., Yu, Y.: Inexact chance-constrained linear programming model for optimal water pollution management at the watershed scale. J. Water Resour. Plan. Manag. 134(4), 347–356 (2008)
Luedtke, J., Ahmed, S.: A sample approximation approach for optimization with probabilistic constraints. SIAM J. Optim. 19(2), 674–699 (2008)
Minoux, M.: Discrete cost multicommodity network optimization problems and exact solution methods. Ann. Oper. Res. 106(1–4), 19–46 (2001)
Minoux, M.: Multicommodity network flow models and algorithms in telecommunications. In: Resende, M., Pardalos, P. (eds.) Handbook of Optimization in Telecommunications, pp. 163–184. Springer, Berlin (2006)
Nemirovski, A., Shapiro, A.: Convex approximations of chance constrained programs. SIAM J. Optim. 17(4), 969–996 (2006)
Pagnoncelli, B., Ahmed, S., Shapiro, A.: Sample average approximation method for chance constrained programming: theory and applications. J. Optim. Theory Appl. 142(2), 399–416 (2009)
Pagnoncelli, B.K., Reich, D., Campi, M.C.: Risk-return trade-off with the scenario approach in practice: a case study in portfolio selection. J. Optim. Theory Appl. 155(2), 707–722 (2012)
Prékopa, A.: Probabilistic programming. In: Ruszczyński, A., Shapiro, A. (eds.) Stochastic Programming, vol. 10, pp. 267–351. Elsevier, Amsterdam (2004)
Ramaswami, R., Sivarajan, K., Sasaki, G.: Optical Networks: A Practical Perspective. Morgan Kaufmann, Los Altos (2009)
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, A Series of Comprehensive Studies in Mathematics, vol. 317. Springer, Berlin (1998)
Römisch, W., Schultz, R.: Stability analysis for stochastic programs. Ann. Oper. Res. 30(1), 241–266 (1991)
Rosenbluth, M.N., Rosenbluth, A.W.: Monte Carlo calculation of the average extension of molecular chains. J. Chem. Phys. 23, 356 (1955)
Rubinstein, R.Y.: Cross-entropy and rare events for maximal cut and partition problems. ACM Trans. Model. Comput. Simul. 12(1), 27–53 (2002)
Rubinstein, R.Y., Shapiro, A.: Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. Wiley, Chichester (1993)
Shapiro, A.: Monte Carlo sampling methods. In: Ruszczyński, A., Shapiro, A. (eds.) Stochastic Programming, Handbooks in Operations Research and Management Science, vol. 10. Elsevier, Amsterdam (2003)
Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on Stochastic Programming: Modeling and Theory, vol. 9. SIAM, Philadelphia (2009)
Soekkha, H.M.: Aviation Safety: Human Factors, System Engineering, Flight Operations, Economics, Strategies, Management. VSP, Utrecht (1997)
Thieu, Q.T., Hsieh, H.Y.: Use of chance-constrained programming for solving the opportunistic spectrum sharing problem under Rayleigh fading. In: 9th International Wireless Communications and Mobile Computing Conference (IWCMC), pp. 1792–1797 (2013)
Tran, Q.K., Parpas, P., Rustem, B., Ustun, B., Webster, M.: Importance sampling in stochastic programming: a Markov chain Monte Carlo approach (2013). http://www.optimization-online.org
Vallejos, R., Zapata-Beghelli, A., Albornoz, V., Tarifeño, M.: Joint routing and dimensioning of optical burst switching networks. Photon Netw. Commun. 17(3), 266–276 (2009)
Acknowledgments
The authors acknowledge the financial support of Grant Anillo ACT-88, Basal Center CMM-UCh, CIRIC-INRIA Chile (J.B., E.M., G.C.), Programa Iniciativa Científica Milenio NC130062 (J.B.) and FONDECYT Grants 1120244 (T.H., B.P.) and 1130681 (E.M.).
Appendices
Appendix 1: MIP formulation for \(\hat{p}^{\text {IS}_0}_a\) estimator under heterogeneous demand
We can formulate this problem as the following integer linear programming model:
Binary variables \(v_{a,k}\), together with Eqs. (57) and (58), ensure that \(v_{a,k}=1\) if and only if \(\sum \nolimits _{c=1}^C y_{c,a}=k\). The role of the binary variables u is explained in the following lemma.
Lemma 2
Let (x, w, u, v) be an optimal solution of the previous formulation. Then there exists an optimal solution \((x,w,\hat{u},v)\) such that:
1. \(\sum _{c=1}^C \hat{\xi }^s_c y_{c,a} \le w_a\) if and only if \(\hat{u}_{a,s,k}=0\) for all \(k=1,\ldots ,C\);
2. if \(\hat{u}_{a,s,k}=1\), then \(\sum _{c=1}^C \hat{\xi }^s_c y_{c,a} =k\).
Hence,
Proof
First, note that constraint (54) imposes that if \(u_{a,s,k}=0\) for all k, then \(\sum _{c=1}^C \hat{\xi }^s_c y_{c,a} \le w_a\). Suppose that \(\sum _{c=1}^C \hat{\xi }^s_c y_{c,a} \le w_a\) but \(u_{a,s,k'}=1\) for some \(k'\). Defining \(\hat{u}_{a,s,k}=0\) for all k and \(\hat{u}=u\) for the remaining variables, \(\hat{u}\) still satisfies Eqs. (54) and (55), and since \(\hat{u}\le u\) it also satisfies Eqs. (59) and (56); hence \((x,w,\hat{u},v)\) is also optimal. Repeating this procedure, it is easy to see that we obtain a solution that satisfies condition (1). For the second condition, suppose that \(u_{a,s,k}=1\) for some k but \(\sum _{c=1}^C \hat{\xi }^s_c y_{c,a} > k\). Let \(\hat{k}=\sum _{c=1}^C \hat{\xi }^s_c y_{c,a}\) and define \(\hat{u}_{a,s,\hat{k}}=1\), \(\hat{u}_{a,s,k}=0\) for all \(k\ne \hat{k}\), and \(\hat{u}=u\) for the remaining variables. By definition, \((x,w,\hat{u},v)\) satisfies (54) and (59), and since \(\hat{k}>k\) it also satisfies (55). On the other hand, since \(\lambda >0\) we have \(e^{-\lambda k}>e^{-\lambda \hat{k}}\), so it also satisfies (56), and thus \((x,w,\hat{u},v)\) is also optimal. Repeating this procedure, we obtain a solution that satisfies condition (2). \(\square \)
Lemma 2 shows that the optimal solution (y, w) of this MIP formulation satisfies
which is the desired approximation of the constraint \(\hat{p}^{\text {IS}_0}_a \le \alpha \).
Appendix 2: Proofs of results
1.1 Proof of Lemma 1
Lemma 1
Suppose that the set-function \(I_x\) is such that \(G(x,\cdot )\) is \(I_x\)-determined for each \(x \in X\). Given an i.i.d. sample \((\hat{\xi }^1,\ldots ,\hat{\xi }^N)\) from the distribution of \(\hat{\xi }\), let
Then \(\hat{p}^{\text {IS}_0}(x)\) is also an unbiased estimator of p(x). Moreover,
Proof
First let us prove that the estimator \(\hat{p}^{\text {IS}_0}(x)\) is unbiased, for which it suffices to show that \(\mathbb {E}_{\hat{\xi }}\left[ {\mathbbm {1}}_{\{G(x,\hat{\xi })>0\}}L_x(\hat{\xi })\right] =p(x)\). Indeed, we have
where the second equality follows from the assumption that \(G(x,\cdot )\) is \(I_x\)-determined, which implies that \(G(x,\hat{\xi })\) is measurable with respect to the sigma-algebra generated by \((\hat{\xi }_i)_{i \in I_x}\).
For the second assertion of the lemma, note that
and therefore
\(\square \)
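The unbiasedness claim can be checked numerically. In the sketch below (the function G, the index set idx standing in for \(I_x\), and all parameter values are hypothetical, chosen only for illustration), only the coordinates in \(I_x\) are twisted and only they enter the likelihood ratio, exactly as in the \(\hat{p}^{\text {IS}_0}\) construction; the estimator still averages to p(x) because G depends on those coordinates alone.

```python
import random

def p_hat_is0(G, idx, p, q, n_samples, rng):
    # IS_0-style estimator: sample every coordinate, but twist (and
    # weight by the likelihood ratio) only the coordinates in idx = I_x
    # that G(x, .) actually depends on; the rest use the original p_i.
    total = 0.0
    for _ in range(n_samples):
        xi = []
        lr = 1.0
        for i, (pi, qi) in enumerate(zip(p, q)):
            if i in idx:
                b = 1 if rng.random() < qi else 0
                lr *= (pi / qi) if b else ((1.0 - pi) / (1.0 - qi))
            else:
                b = 1 if rng.random() < pi else 0  # untwisted coordinate
            xi.append(b)
        if G(xi) > 0:
            total += lr
    return total / n_samples
```

For instance, if G depends only on the first three of six Bernoulli(0.2) coordinates, p(x) is \(0.2^3=0.008\) and the estimate matches it even though the last three coordinates are never reweighted.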
1.2 Proof of Proposition 3
Proposition 3
Let \(\zeta _1,\ldots ,\zeta _m\) be \(m\ge 1\) independent Bernoulli random variables with \(\mathbb {P}\{\zeta _i=1\}=p_i\), and suppose that \(0<p_i<1\) for all i. Let \(Z:= \sum _{i=1}^m \zeta _i\), and define \(\delta := \min _i\, p_i(1-p_i) > 0\). Then, we have
Proof
Let \(u:[0,m]\rightarrow \mathbb {R}\) be the function defined as \(u(t):= m^2 - t^2\). Since \(u(\cdot )\) is nonnegative and decreasing on [0, m], we have that
where the last inequality follows from Markov’s inequality. Thus, we have
Next, notice that independence of \(\{\zeta _i\}\) implies that \(\text {Var}(Z)=\sum _{i=1}^m p_i(1-p_i)\). Moreover, since \(0<\mathbb {E}[Z]<m\) we have that \(m + \mathbb {E}[Z] < 2m\), \(m - \mathbb {E}[Z]< m\) and thus from (66) we have that
\(\square \)
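The two ingredients of the proof, the independence identity \(\text {Var}(Z)=\sum _{i=1}^m p_i(1-p_i)\) and the resulting lower bound \(m\delta \le \text {Var}(Z)\), are easy to confirm by simulation. A minimal sketch (the function name and parameter values are ours, for illustration only):

```python
import random

def empirical_variance_of_sum(p, n_samples, rng):
    # Draw Z = sum of independent Bernoulli(p_i) repeatedly and return
    # the unbiased sample variance, to compare with sum_i p_i(1 - p_i).
    vals = [sum(1 for pi in p if rng.random() < pi) for _ in range(n_samples)]
    mean = sum(vals) / n_samples
    return sum((v - mean) ** 2 for v in vals) / (n_samples - 1)
```

With \(p=(0.2,0.5,0.8)\) the exact variance is \(0.16+0.25+0.16=0.57\), which indeed dominates \(m\delta = 3\cdot 0.16 = 0.48\).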
1.3 Proof of Theorem 1
Theorem 1
Suppose that \(0<\rho _c<1\) for all \(c=1,\dots , C\). Let \(x=(y,w)\) be such that \(w \in \mathbb {N}\) satisfies \(\sum _{c=1}^C \rho _c y_c < w \le \sum _{c=1}^C y_c-1\). Then the function \(B_{x}(\varvec{\lambda })\) is convex and there exists \(\lambda ^*_{x}\in \mathbb {R}_+\cup \{\infty \}\) such that the vector \(\varvec{\lambda }\) defined by \(\lambda _c=\lambda ^*_{x}\) for all \(c=1,\dots ,C\) minimizes \(B_{x}(\varvec{\lambda })\). If \(w=\sum _{c=1}^C y_c-1\), then the optimal \(\lambda ^*_{x}\) is \(\lambda ^*_{x}=\infty \) and \(\hat{\rho }_c(\lambda ^*_{x})=1\); otherwise, \(\lambda ^*_{x}\) and \(\hat{\rho }_c(\lambda ^*_{x})\) satisfy
To prove the theorem we need the following lemma, whose proof is given after that of the theorem.
Lemma 3
For \(n\ge 1\), let \(\rho _i\), \(i=1,\ldots ,n\) be numbers such that \(\rho _i \in (0,1)\) and \(\rho _1 \ge \rho _2 \ge \ldots \ge \rho _n\). Given an integer w such that \(0 \le w \le n-1\), consider problem (P) defined as follows:
Then, there exists an optimal solution to (P) that satisfies \(\lambda _1 \le \lambda _2 \le \ldots \le \lambda _n\).
Proof
(of Theorem 1) Let \(n=\sum _{c=1}^C y_c\). Without loss of generality, and to simplify notation, assume that the set \(\{c: y_c=1\}\) is \(\{1,\ldots ,n\}\). Since the \(\log \) function is increasing, we have that
By Lemma 3, minimizing \(\log (B_{x}(\varvec{\lambda }))\) amounts to solving the following problem:
Note that the objective function of the above problem is strictly convex in \(\varvec{\lambda }\). In fact, its second derivatives are
Since \(B_x(\varvec{\lambda })=\exp (\log (B_x(\varvec{\lambda })))\) and \(\log (B_x(\varvec{\lambda }))\) is convex—though not strictly convex due to the components \(\lambda _c\) such that \(y_c=0\)—it follows that \(B_x\) is convex in \(\varvec{\lambda }\). Of course, the components \(\lambda _c\) such that \(y_c=0\) do not affect the value of \(B_x(\varvec{\lambda })\).
Suppose first that \(w=n-1\). Then, the first derivative of the objective function in (67) is given by
so we see that \(\lim _{\varvec{\lambda }\rightarrow \infty } \nabla \psi (\varvec{\lambda }) =0\). In particular, the limit can be taken along the diagonal \(\lambda _i=\lambda \), \(i=1,\ldots ,n\); that is, in this case the optimal solution of (67)–(69) is \(\lambda _i=\infty \), \(i=1,\ldots ,n\).
Consider now the case \(w<n-1\). We will show that problem (67)–(69) has a unique optimal solution, which can be found by writing the Karush–Kuhn–Tucker (KKT) conditions as follows:
where \(\varvec{\mu }=(\mu _i)\) is the vector of Lagrange multipliers of constraints (68) and \(\mu _0\) is the Lagrange multiplier of constraint (69).
Consider now a particular choice of vectors \(\varvec{\mu }\) and \(\varvec{\lambda }\) defined as follows. All components of \(\varvec{\lambda }\) are identical, with \(\lambda _i=\lambda ^*\), where \(\lambda ^*\in \mathbb {R}_{+}\) solves the equation
Note that we can always find such \(\lambda ^*\), since the function \(\varphi (\lambda )\) is continuous and increasing, and
The inequalities in (76) follow from the assumptions of the theorem on w and the fact that we are analyzing the case \(w<n-1\). The components of \(\varvec{\mu }\) are defined as
We claim that \(\varvec{\mu }\) and \(\varvec{\lambda }\) satisfy the KKT conditions (70)–(74) laid out above. To see that, observe that Eqs. (78)–(79) imply (70). Equation (71) follows from (75), since we have
and the latter term coincides with \(\mu _{n-1}\) defined in (79). Equations (72) and (73) are trivially satisfied. Finally, we show that (74) holds, with strict inequality if \(i\ge 1\). Indeed, (75) implies that
and we clearly have
since each term in the sum is less than 1. \(\square \)
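For concreteness, the standard exponential twist of a Bernoulli parameter, \(\hat{\rho }(\lambda )=\rho e^{\lambda }/(1-\rho +\rho e^{\lambda })\), is consistent with the theorem's boundary behavior \(\hat{\rho }_c(\lambda ^*_{x})=1\) when \(\lambda ^*_{x}=\infty \). The sketch below is an illustration only: we assume, purely for the sake of the example, that \(\lambda ^*\) is chosen so that the twisted mean load \(\sum _i \hat{\rho }_i(\lambda )\) hits a target level such as w; the paper's Eq. (75) defines \(\lambda ^*\) through \(\varphi \), which is not reproduced here. The root is found by bisection, exploiting the fact that the map is continuous and increasing in \(\lambda \), just as \(\varphi \) is.

```python
import math

def twisted_rho(rho, lam):
    # Exponentially twisted Bernoulli parameter; tends to 1 as lam -> inf.
    e = math.exp(lam)
    return rho * e / (1.0 - rho + rho * e)

def solve_lambda(rhos, target, lo=0.0, hi=50.0):
    # Bisection for lambda with sum_i twisted_rho(rho_i, lambda) = target;
    # valid when sum(rhos) < target < len(rhos), by monotone continuity.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(twisted_rho(r, mid) for r in rhos) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For four channels with \(\rho _c=0.3\) and target 2, the common twist solves \(0.3e^{\lambda }/(0.7+0.3e^{\lambda })=0.5\), i.e. \(\lambda =\log (7/3)\).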
Proof
(of Lemma 3) Suppose that \(\varvec{\lambda }=(\lambda _1, \ldots , \lambda _n)\) is an optimal solution and there exists some \(j<n\) such that \(\lambda _j>\lambda _{j+1}\). We will show that \(\bar{\varvec{\lambda }}\), defined by \(\bar{\lambda }_j=\lambda _{j+1}\), \(\bar{\lambda }_{j+1}=\lambda _j\) and \(\bar{\lambda }_i=\lambda _i\) for \(i\notin \{j,j+1\}\), has an objective value no worse than that of \(\varvec{\lambda }\). Let \(\varDelta \) denote the difference in objective value between \(\varvec{\lambda }\) and \(\bar{\varvec{\lambda }}\), i.e.,
We will prove that \(\varDelta \ge 0\), showing that \({\bar{\varvec{\lambda }}}\) is no worse than \(\varvec{\lambda }\). Note initially that
since the maximum value on both sides is equal to the sum of the smallest \(w+1\) components of the vector \(\varvec{\lambda }\). Thus, we only need to compare the remaining part of the objective function, i.e., we have
Since \(\bar{\lambda }_{j}=\lambda _{j+1}\) and \(\bar{\lambda }_{j+1}=\lambda _{j}\), it follows that
Note that the argument inside the \(\log \) is positive, since \(\lambda _j>\lambda _{j+1}\). Moreover, since \(\rho _j \ge \rho _{j+1}\), we see that \(1/\rho _{j}-1\le 1/\rho _{j+1}-1\) and hence we conclude that \(\varDelta \ge 0\). \(\quad \square \)
Barrera, J., Homem-de-Mello, T., Moreno, E. et al. Chance-constrained problems and rare events: an importance sampling approach. Math. Program. 157, 153–189 (2016). https://doi.org/10.1007/s10107-015-0942-x