1 Introduction

We consider chance-constrained stochastic programs (CCSPs) of the form

$$\begin{aligned} \mathrm {v}^*= \min \left\{ f(x) : \ x \in S, \ {\mathbb {P}}[x \in X(\varvec{\xi })] \ge 1-\epsilon \right\} . \end{aligned}$$
(1)

We make the following assumptions throughout this paper: (i) the random vector \({\varvec{\xi }}\) has finite support, i.e., \(\Xi = \{\xi ^1,\ldots ,\xi ^N\}\), where each \(i \in \mathcal {N}:= \{1,\ldots ,N\}\) is referred to as a scenario; (ii) \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is a linear function, i.e., \(f(x)=c^\top x\) (if f(x) is nonlinear, we can introduce a new variable y, add the constraint \(f(x)\le y\) to S and change the objective to minimizing y); (iii) \(X(\xi ^i)\subseteq S\) for each \(i\in \mathcal {N}\); otherwise, we can replace \(X(\xi ^i)\) by \(X(\xi ^i)\cap S\), which yields an equivalent problem; (iv) the feasible region is nonempty. We can then rewrite (1) as

$$\begin{aligned} \mathrm {v}^*= \min \left\{ c^\top x : \ x \in S, \ \sum _{i \in \mathcal {N}} p_i\mathbb {I}(x \in X^i) \ge 1-\epsilon \right\} , \end{aligned}$$
(2)

where S is a mixed-integer set (i.e., \(S\subseteq {\mathbb {R}}^{n-r} \times \mathbb {Z}^r\) with \(0\le r\le n\)), \(X^i := X(\xi ^i)\), \(\mathbb {I}(\cdot )\) is the indicator function and \(p_i\) is the probability mass associated with scenario i. Using a binary variable \(z_i\) to model the indicator function for each scenario i, we reformulate (1) as

$$\begin{aligned} \mathrm {v}^*= \min \bigl \{ c^\top x : \ x \in S, \ z_i = \mathbb {I}(x \in X^i), i \in \mathcal {N}, \ z \in Z\bigr \}, \end{aligned}$$
(3)

where

$$\begin{aligned} Z:= \left\{ z \in \{0,1\}^N : \sum _{i \in \mathcal {N}} p_iz_i \ge 1-\epsilon \right\} . \end{aligned}$$

We assume throughout this paper that S, and therefore \(X^i\), are compact sets for all \(i\in \mathcal {N}\). Our results can be directly generalized to the unbounded case when the sets \({{\mathrm{conv}}}(X^i)\) for all \(i \in \mathcal {N}\) share the same recession cone.

CCSPs were first studied in [10, 11]. Subsequently, significant research has been carried out on structural and algorithmic issues for CCSPs under various distributional settings [14, 15, 18, 29, 36], as well as on sampled approximations [5, 6, 17, 25, 28]. A CCSP with a finite number of scenarios can be formulated as a large-scale mixed-integer program (MIP) by introducing a binary variable for each individual scenario and adding big-M inequalities into the formulation. However, the natural MIP formulation based on big-M inequalities often has a weak linear programming relaxation [7]. Motivated by this drawback, there has been significant recent work investigating the use of MIP techniques for solving CCSPs having a finite number of scenarios. In particular, the mixing structure of CCSPs has been studied in [20, 24, 26]. By exploiting special combinatorial structures of CCSPs, [21, 34, 35] introduced problem formulations without the binary scenario variables. Chance-constrained formulations have also been proposed for two-stage [23] and multi-stage settings [38].

In this paper we introduce two new Lagrangian relaxation techniques for obtaining lower bounds for the CCSP (3). Inspired by the associated Lagrangian dual problems, we also introduce new MIP formulations of (3) that yield stronger relaxations than existing formulations. The Lagrangian relaxations we construct are obtained by variable splitting: creating multiple copies of the variables x, which are constrained to be equal to each other, and then constructing a relaxation in which these “nonanticipativity constraints” are relaxed. In stochastic programming this technique is known as dual decomposition; it was first introduced in [31] and used in [8] for obtaining strong relaxations of two-stage stochastic integer programs. See also [16, 32] for more results on dual decomposition in two-stage stochastic programming. Some Lagrangian relaxation approaches for CCSPs have been investigated recently. In [3] the original problem constraints defining \(X^i\) are relaxed within an augmented Lagrangian relaxation framework. In [15], auxiliary variables are introduced to separate the deterministic and probabilistic constraints, and the deterministic constraints are relaxed. The work in [36] exploits decomposable structure of the deterministic constraints for a specific class of CCSPs. Finally, in [37], both the nonanticipativity constraints and the knapsack constraint \(\sum _{i \in \mathcal {N}} p_iz_i \ge 1-\epsilon \) are relaxed. In contrast, we do not relax the knapsack constraint or the original problem constraints and work directly with the original formulation (3), leading to relaxation bounds that are better than existing alternatives. Somewhat surprisingly, even though the knapsack constraint—which links scenarios together—is not relaxed, the majority of the work required to solve our proposed Lagrangian relaxation problems can still be decomposed by scenarios.

The remainder of the paper is organized as follows. In Sect. 2, we discuss three valid lower bounds, which can be obtained by continuous relaxation, quantile bounding and scenario grouping. We then provide two new Lagrangian dual formulations based on relaxing the nonanticipativity constraints in Sect. 3. In Sect. 4, we compare these bounds with the basic lower bounds introduced in Sect. 2. We derive new primal formulations in Sect. 5 that are related to the dual formulations from Sect. 3. In Sect. 6 we present a heuristic and two new exact algorithms to solve CCSPs. Finally, we devote Sect. 7 to a computational illustration of the lower bounds and the performance of the proposed algorithms, and make concluding remarks in Sect. 8.

2 Basic lower bounds

We first present three different lower bounds for the CCSP (3). The first two are known results, while the third bound, presented in Sect. 2.3, is new.

2.1 Continuous relaxation

Assume that for each scenario i, \(X^i\) is described by an inequality system \(G_i(x) \le 0\) with a mapping \(G_i:S \rightarrow {\mathbb {R}}^{m_i}\). Problem (3) can then be modeled as the following MIP:

$$\begin{aligned} \mathrm {v}^*=\min _{x,z} \displaystyle {\left\{ c^\top x : \ x \in S, \ G_i(x) \le M_i(1-z_i),\ \forall i \in \mathcal {N}, z \in Z\right\} }, \end{aligned}$$
(4)

where \(M_i\) is a vector of big-M parameters such that \(M_{ij}\) gives a valid upper bound on \(G_{ij}(x)\) for all feasible x in (2), that is, \(M_{ij} \ge \max \{G_{ij}(x): x\in S, \ \sum _{k \in \mathcal {N}} p_k\mathbb {I}(x \in X^k) \ge 1-\epsilon \}\) for each \(j = 1,2,\ldots , m_i\). It is impractical to compute the tightest possible upper bound, since this involves solving another CCSP. Therefore, computationally tractable techniques for deriving upper bounds have been investigated. For example, one may begin by choosing \(M_{ij} \ge \sup \{G_{ij}(x): \ x \in S\cap X^i\}\) for all \(j=1,2,\ldots , m_i\). This simple bound could then be strengthened using a coefficient strengthening procedure that considers more of the feasible region of (4), as in [30, 35] (see Sect. 7). We assume that whenever a strengthened big-M parameter \(M'_{ij} < M_{ij}\) is obtained, we include the valid inequality \(G_{ij}(x) \le M'_{ij}\) in S, so that \(S\subseteq \{x \in {\mathbb {R}}^n : G_i(x) \le M'_i, \ \forall i\in \mathcal {N}\}\). We define \(\mathrm {v}^C(M)\) to be the optimal objective value of the relaxation of (4) in which the integrality constraints on the z variables are relaxed (but the integrality constraints on x, if any, are not relaxed). We denote by \(\underline{\mathrm {v}}^C(M)\) the relaxation obtained by also relaxing integrality constraints on x. Clearly, we have

$$\begin{aligned} \underline{\mathrm {v}}^C(M)\le \mathrm {v}^C(M)\le \mathrm {v}^*. \end{aligned}$$
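When S and \(X^i\) are polyhedral, the simple starting bound \(M_{ij} \ge \sup \{G_{ij}(x): x \in S\cap X^i\}\) is itself a linear program per row. A minimal sketch using SciPy's LP solver; the instance data below are illustrative assumptions, not from the paper:

```python
import numpy as np
from scipy.optimize import linprog

def big_m_bounds(G_rows, G_rhs, A, b, x_bounds):
    """Per-scenario big-M values M_j = max{ g_j^T x - h_j : A x <= b, x in x_bounds },
    where row j of G is G_j(x) = g_j^T x - h_j and (A, b) describe S ∩ X^i."""
    M = []
    for g_j, h_j in zip(G_rows, G_rhs):
        # linprog minimizes, so maximize g_j^T x by minimizing its negation
        res = linprog(c=-g_j, A_ub=A, b_ub=b, bounds=x_bounds)
        assert res.success, "bounding LP should be feasible and bounded for compact S"
        M.append(-res.fun - h_j)
    return M

# Illustrative (hypothetical) data: one row G(x) = x1 + x2 - 1 over
# the box 0 <= x <= 2 intersected with x1 + 2 x2 <= 3.
M = big_m_bounds([np.array([1.0, 1.0])], [1.0],
                 np.array([[1.0, 2.0]]), np.array([3.0]),
                 [(0, 2), (0, 2)])
print(M)  # max of x1 + x2 over the polyhedron is 2.5, so M = [1.5]
```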

2.2 Quantile bound

Another lower bound for problem (3) is the so-called quantile bound [35]. We first calculate for each \(i \in \mathcal {N}\),

$$\begin{aligned} \eta _i = \min _x \left\{ c^\top x : \ x \in X^i\right\} . \end{aligned}$$

We then sort these values to obtain a permutation \(\sigma \) of \(\mathcal {N}\) such that \(\eta _{\sigma _1} \ge \cdots \ge \eta _{\sigma _N}\). The quantile bound is then defined as \(\mathrm {v}^Q= \eta _{\sigma _q}\), where \(q = \min \{ k \in \mathcal {N}: \sum _{i=1}^k p_{\sigma _i} > \epsilon \}\). Then clearly

$$\begin{aligned} \mathrm {v}^Q\le \mathrm {v}^*\end{aligned}$$

because at least one scenario in the scenario set \(\{ \sigma _1,\ldots ,\sigma _q\}\) must be satisfied in a feasible solution.
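Given the scenario values \(\eta _i\), the quantile bound amounts to a sort and a cumulative-probability scan. A short sketch, assuming the \(\eta _i\) have already been computed by a per-scenario solver:

```python
def quantile_bound(eta, p, eps):
    """Quantile bound v^Q: sort scenario values eta_i in nonincreasing order and
    return eta_{sigma_q}, where q is the first index whose cumulative
    probability strictly exceeds eps."""
    order = sorted(range(len(eta)), key=lambda i: -eta[i])
    cum = 0.0
    for i in order:
        cum += p[i]
        if cum > eps:
            return eta[i]
    raise ValueError("scenario probabilities sum to at most eps")

# Data of Example 1 in Sect. 4.3: eta = (0.5, 0.5, 1.0), equiprobable, eps = 1/3
print(quantile_bound([0.5, 0.5, 1.0], [1/3, 1/3, 1/3], 1/3))  # 0.5
```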

2.3 Scenario grouping based lower bound

We partition the scenarios \(\mathcal {N}\) into \(K < N\) disjoint subsets \(\mathcal {N}_j,j\in \mathcal {K}=\{1,\ldots , K\}\) where \(\sum _{j=1}^K|\mathcal {N}_j|=N\). For each \(j\in \mathcal {K}\), we define \(\tilde{z}_j=1\) if \(z_i =1\) for all the scenarios \(i \in \mathcal {N}_j\) and 0 otherwise (i.e., \(\tilde{z}_j= \min \{ z_i : i \in \mathcal {N}_j \}\)). For each \(j \in \mathcal {K}\), we define \(q_j = \min _{i \in \mathcal {N}_j} p_i\). We then define the following scenario grouping model:

$$\begin{aligned} \mathrm {v}^G= \min \left\{ c^\top x: x \in S, \tilde{z}_j= \mathbb {I}\Bigl (x \in \bigcap _{i\in \mathcal {N}_j}X^i\Bigr ), j \in \mathcal {K}, \ \sum _{j \in \mathcal {K}} q_j \tilde{z}_j\ge \sum _{j \in \mathcal {K}} q_j - \epsilon \right\} . \end{aligned}$$
(5)

Proposition 1

The scenario grouping model (5) is a relaxation of the CCSP (3), i.e., \(\mathrm {v}^G\le \mathrm {v}^*\).

Proof

Let (xz) be any feasible solution to (3) and let \(\tilde{z}_j = \min \{ z_i : i \in \mathcal {N}_j \}\) for each \(j \in \mathcal {K}\). We show that \((\tilde{z},x)\) is feasible to (5). By construction it holds that \(\tilde{z}_j = \mathbb {I}(x \in \bigcap _{i\in \mathcal {N}_j}X^i)\) for each \(j \in \mathcal {K}\). We first establish the following inequality for each \(j \in \mathcal {K}\):

$$\begin{aligned} q_j \tilde{z}_j \ge \sum _{i \in \mathcal {N}_j} p_i z_i - \left( \sum _{i \in \mathcal {N}_j} p_i - q_j\right) . \end{aligned}$$
(6)

Indeed, if \(\tilde{z}_j = 0\) this implies that \(z_i = 0\) for some \(i \in \mathcal {N}_j\), and thus \(\sum _{i \in \mathcal {N}_j} p_i z_i \le \sum _{i \in \mathcal {N}_j} p_i - q_j\) as required. On the other hand, if \(\tilde{z}_j = 1\) this implies that \(z_i = 1\) for all \(i \in \mathcal {N}_j\), and (6) follows. Now, summing (6) over all \(j \in \mathcal {K}\) yields

$$\begin{aligned} \sum _{j \in \mathcal {K}} q_j\tilde{z}_j&\ge \sum _{j \in \mathcal {K}} \sum _{i \in \mathcal {N}_j} p_i z_i - \sum _{j \in \mathcal {K}} \sum _{i \in \mathcal {N}_j} p_i + \sum _{j \in \mathcal {K}} q_j \ge 1 -\epsilon - 1 + \sum _{j \in \mathcal {K}} q_j, \end{aligned}$$

which establishes the result. \(\square \)

If we scale the knapsack inequality in (5) by \((\sum _{j \in \mathcal {K}} q_j)^{-1}\), this problem is again a CCSP, but with \(K < N\) scenarios. Thus, any technique for obtaining a lower bound of a CCSP can also be applied to this relaxation. In particular, the quantile bound may be applied, and the resulting scenario grouping based quantile bound may be better than the original quantile bound (see Sect. 7 for an illustration). The dual bounds that we derive in the following sections may also be applied to a grouping-based relaxation.
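Since the scaled grouped problem is again a CCSP, the quantile bound applies to it directly. A sketch, assuming the group subproblem values \(\tilde{\eta }_j = \min \{c^\top x : x \in \bigcap _{i\in \mathcal {N}_j}X^i\}\) are given; the numeric data in the usage example are hand-computed from Example 1 of Sect. 4.3 and are illustrative only:

```python
def grouped_quantile_bound(eta_group, q, eps):
    """Quantile bound applied to the scenario-grouping relaxation (5).
    After scaling by (sum_j q_j)^{-1}, (5) is a CCSP with probabilities
    q_j / Q and risk level eps / Q, so the cutoff condition
    'cumulative q exceeds eps' is unchanged by the scaling."""
    order = sorted(range(len(eta_group)), key=lambda j: -eta_group[j])
    cum = 0.0
    for j in order:
        cum += q[j]
        if cum > eps:
            return eta_group[j]
    raise ValueError("grouped probabilities sum to at most eps")

# Groups N_1 = {1, 2}, N_2 = {3} for Example 1 of Sect. 4.3 (hand-computed):
# eta~_1 = min{x1 + x2 : x in X^1 ∩ X^2} = 0.8, eta~_2 = 1.0, q_1 = q_2 = 1/3.
print(grouped_quantile_bound([0.8, 1.0], [1/3, 1/3], 1/3))  # 0.8 > v^Q = 0.5
```

On this instance the grouped quantile bound (0.8) improves on the plain quantile bound (0.5).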

3 Lagrangian dual bounds

We next introduce two Lagrangian dual problems associated with the CCSP (3), obtained by relaxing nonanticipativity constraints. We use the following standard result (cf. [22, 27]) on a primal characterization of the Lagrangian dual.

Theorem 1

Consider a mathematical program \(\min \{f(x) : H(x)\le h, x\in X\}\), where f and H are convex functions and X is compact, and let

$$\begin{aligned} \mathcal {L}^*:=\sup _{\lambda \ge 0} \left\{ \min _{x}\left\{ f(x)+\lambda ^\top ( H(x)-h):x\in X\right\} \right\} \end{aligned}$$

be the Lagrangian dual value. Then

$$\begin{aligned} \mathcal {L}^*=\inf \left\{ f(x) : x \in {{\mathrm{conv}}}(X), H(x) \le h\right\} , \end{aligned}$$

where \({{\mathrm{conv}}}(X)\) denotes the convex hull of the set X.

3.1 Basic nonanticipative dual

By making copies of the decision variables x, problem (3) can be reformulated as

$$\begin{aligned} \mathrm {v}^*= \min _{x,z} \quad&\sum _{i \in \mathcal {N}} p_i c^\top x^i&(7\text {a})\\ \text {s.t.} \quad&\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&(7\text {b})\\&x^i \in S, \quad \forall i \in \mathcal {N},&(7\text {c})\\&z \in Z,&(7\text {d})\\&z_i = \mathbb {I}(x^i \in X^i), \quad \forall i \in \mathcal {N},&(7\text {e}) \end{aligned}$$

where (7b) enforces the nonanticipativity constraints \(x^1 = \cdots = x^N\) through suitable matrices \(H_i\) and a right-hand side vector \(h\). The Lagrangian dual problem obtained by dualizing these nonanticipativity constraints with dual vector \(\lambda \) can be written as:

$$\begin{aligned} \mathrm {v}^{LD}_1= \max _{\lambda } \left\{ \mathcal {L}_1(\lambda ) - \lambda ^{\top } h \right\} , \end{aligned}$$
(8)

where

$$\begin{aligned} \mathcal {L}_1(\lambda ) := \min _{x,z} \left\{ \sum _{i \in \mathcal {N}} p_i(c^\top x^i + \lambda ^{\top }H_ix^i): \, \, \mathrm{(7c)}{-}\mathrm{(7e)} \right\} . \end{aligned}$$
(9)

Next, observe that problem (9) is equivalent to minimizing \(\psi (z)\) over \(z\in Z\), where

$$\begin{aligned} \psi (z) := \min _x \left\{ \sum _{i \in \mathcal {N}} p_i(c^\top x^i + \lambda ^{\top }H_ix^i): \mathrm{(7c)},\, \mathrm{(7e)} \right\} . \end{aligned}$$

This problem decomposes by scenario. Let

$$\begin{aligned} \theta _i(\lambda ) = \min _x \left\{ c^\top x + \lambda ^{\top }H_ix: \ x \in S\right\} \end{aligned}$$
(10)

and

$$\begin{aligned} \zeta _i(\lambda ) = \min _x \left\{ c^\top x + \lambda ^{\top }H_ix: \ x \in X^i\right\} . \end{aligned}$$
(11)

Note that the feasible region of (11) is contained in that of (10) since \(X^i \subseteq S\), so \(\zeta _i(\lambda )\ge \theta _i(\lambda )\) for all \(i\in \mathcal {N}\). By compactness of S, both \(\theta _i\) and \(\zeta _i\) are finite-valued for all \(\lambda \). Then, we have

$$\begin{aligned} \psi (z) = \sum _{i \in \mathcal {N}} p_i \bigl (\theta _i(\lambda )(1-z_i) + \zeta _i(\lambda )z_i\bigr )=\sum _{i \in \mathcal {N}} p_i \theta _i(\lambda )+ \sum _{i \in \mathcal {N}}p_i\bigl (\zeta _i(\lambda )-\theta _i(\lambda )\bigr ) z_i \end{aligned}$$

and so

$$\begin{aligned} \mathcal {L}_1(\lambda ) =\sum _{i \in \mathcal {N}} p_i \theta _i(\lambda )+ \min _z \left\{ \sum _{i \in \mathcal {N}}p_i(\zeta _i(\lambda )-\theta _i(\lambda )) z_i: \ z \in Z\right\} . \end{aligned}$$
(12)

Thus, for a fixed \(\lambda \), the Lagrangian relaxation value \(\mathcal {L}_1(\lambda ) - \lambda ^\top h\) can be calculated by first calculating the values \(\theta _i(\lambda )\) and \(\zeta _i(\lambda )\) by solving (10) and (11) separately for each \(i \in \mathcal {N}\), and then solving a single-row knapsack problem.
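For equiprobable scenarios the knapsack in (12) reduces to selecting the \(\lceil N(1-\epsilon )\rceil \) smallest increments \(\zeta _i(\lambda )-\theta _i(\lambda )\). A sketch of the evaluation under that assumption, with the subproblem values (10) and (11) supplied externally:

```python
import math

def lagrangian_value_L1(theta, zeta, eps, lam_dot_h):
    """Evaluate L_1(lambda) - lambda^T h via (12), assuming equiprobable
    scenarios (p_i = 1/N).  theta[i] and zeta[i] are the subproblem values
    (10) and (11); the increments zeta_i - theta_i are nonnegative since
    X^i is contained in S, so the knapsack over Z keeps z_i = 1 exactly
    for the ceil(N(1 - eps)) smallest increments."""
    N = len(theta)
    k = math.ceil(N * (1 - eps))
    increments = sorted(z - t for z, t in zip(zeta, theta))
    return sum(theta) / N + sum(increments[:k]) / N - lam_dot_h

# Illustrative values only (a real run would obtain them from an LP/MIP solver):
print(lagrangian_value_L1([0.0, 1.0, 2.0], [1.0, 1.0, 3.0], 1/3, 0.0))  # 4/3
```

For general probabilities the knapsack \(\min \{\sum _i p_i(\zeta _i-\theta _i)z_i : z \in Z\}\) is a min-knapsack problem and would be solved as a small MIP instead.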

We close this subsection by noting that the dual problem (8) can be interpreted as a stochastic program with a mean-risk objective function, see the online supplement.

3.2 Quantile based Lagrangian dual

The quantile bound in Sect. 2.2 can be interpreted as a relaxation obtained by creating a copy \(x^i\) of the variables x for each \(i\in \mathcal {N}\), as in the reformulation (7), but then instead of using the weighted average of the objective values of these copies, the maximum objective function value among the enforced scenarios is used. This motivates the following alternative reformulation of (3):

$$\begin{aligned} \mathrm {v}^*= \min _{x,y,z} \quad&y&(13\text {a})\\ \text {s.t.} \quad&c^\top x^i \le y, \quad \forall i \in \mathcal {N},&(13\text {b})\\&\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&(13\text {c})\\&x^i \in S, \quad \forall i \in \mathcal {N},&(13\text {d})\\&z \in Z,&(13\text {e})\\&z_i = \mathbb {I}(x^i \in X^i), \quad \forall i \in \mathcal {N},&(13\text {f}) \end{aligned}$$

where (13c)–(13f) are just a restatement of (7b)–(7e). For a fixed \(y \in {\mathbb {R}}\), we further define the problem:

$$\begin{aligned} g(y) := y + \min _{x,z} \left\{ 0 : \mathrm{(13b)}{-}\mathrm{(13f)} \right\} . \end{aligned}$$
(14)

Clearly, \(g(y) = y\) if (13b)–(13f) is feasible for this fixed y value, otherwise (14) is infeasible, and we use the convention \(g(y) = +\infty \) in this case. Then (13) can be formulated as:

$$\begin{aligned} \mathrm {v}^*= \min _{y} \left\{ g(y) : y \in {\mathbb {R}}\right\} . \end{aligned}$$

Next, for a fixed \(y \in {\mathbb {R}}\), let

$$\begin{aligned} R(y) = \left\{ \{x^i,z_i\}_{ i \in \mathcal {N}} : \mathrm{(13b)},\, \mathrm{(13d)}{-}\mathrm{(13f)}\right\} \end{aligned}$$

be the set of feasible solutions to (13) in which the nonanticipativity constraints (13c) are relaxed, and the variable y is fixed. Also, define

$$\begin{aligned} \mathcal {L}_2(\lambda ,y) = \min _{x,z} \Biggl \{ \sum _{i \in \mathcal {N}} p_i \lambda ^{\top }H_ix^i : \{x^i,z_i\}_{ i \in \mathcal {N}} \in R(y) \Biggr \}, \end{aligned}$$

and finally

$$\begin{aligned} \omega _2(y) = y + \max _{\lambda } \bigl \{ \mathcal {L}_2(\lambda ,y) - \lambda ^{\top } h \bigr \}. \end{aligned}$$
(15)

We use the notation \(\omega _2(y) = +\infty \) to indicate that the maximization problem in (15) is unbounded. In fact, as the following proposition shows, the maximization problem either has an optimal objective value that equals zero or is unbounded.

Proposition 2

There exists \(\bar{y} \in {\mathbb {R}}\) such that

$$\begin{aligned} \omega _2(y)=\left\{ \begin{array}{cc} y,&{}\quad \text {if}\; y \ge \bar{y}, \\ \infty , &{}\quad \text {if}\; y < \bar{y}. \end{array} \right. \end{aligned}$$
(16)

Proof

By Theorem 1, \(\omega _2(y)=y + \min \{0: \{x^i,z_i\}_{ i \in \mathcal {N}}\in T(y)\}\) where \(T(y)= \{ \{x^i,z_i\}_{ i \in \mathcal {N}}:\mathrm{(13c)}, \{x^i,z_i\}_{ i \in \mathcal {N}} \in {{\mathrm{conv}}}(R(y))\}\). Thus, \(\omega _2(y)=y\) if \(T(y)\ne \emptyset \) and \(\omega _2(y)=\infty \), otherwise. Next, for y large enough, any feasible solution to (3) can be used to construct a feasible point in T(y) (just set all \(x^i\) equal to x), and so for y large enough \(\omega _2(y) = y\). In addition, since the set S is compact, it follows that for y small enough the set R(y) is empty, and hence T(y) is empty. The result then follows because \(T(y_1)\subseteq T(y_2)\) whenever \(y_1\le y_2\). \(\square \)

We now define the quantile-based Lagrangian dual problem as:

$$\begin{aligned} \mathrm {v}^{LD}_2= \min _{y} \left\{ \omega _2(y) : y \in {\mathbb {R}}\right\} = \min _{y} \left\{ y : \omega _2(y) = y \right\} . \end{aligned}$$
(17)

Theorem 2

The quantile-based Lagrangian dual problem (17) is a relaxation of the CCSP (3), i.e., \(\mathrm {v}^{LD}_2\le \mathrm {v}^*\).

Proof

This follows because \(\omega _2(y) \le g(y)\) for all \(y \in {\mathbb {R}}\). \(\square \)

We next discuss the calculation of \(\mathrm {v}^{LD}_2\). First, for a given \(\lambda \) and y, \(\mathcal {L}_2(\lambda ,y)\) can be calculated by solving for each \(i \in \mathcal {N},\)

$$\begin{aligned} \bar{\theta }_i(\lambda ,y) := \min _x \left\{ \lambda ^{\top }H_ix:c^\top x \le y, \ x \in S\right\} \end{aligned}$$

and

$$\begin{aligned} \bar{\zeta }_i(\lambda ,y) := \min _x \left\{ \lambda ^{\top }H_ix: \, c^\top x \le y, \ x \in X^i \right\} . \end{aligned}$$

Then,

$$\begin{aligned} \mathcal {L}_2(\lambda ,y) = \sum _{i \in \mathcal {N}} p_i \bar{\theta }_i(\lambda ,y) + \min _z \left\{ \sum _{i \in \mathcal {N}} p_i \bigl (\bar{\zeta }_i(\lambda ,y) - \bar{\theta }_i(\lambda ,y)\bigr )z_i : z \in Z \right\} . \end{aligned}$$

The above characterization leads to a bisection procedure to obtain a lower bound on \(\mathrm {v}^{LD}_2\). It takes as input an upper bound, U, on the optimal objective value \(\mathrm {v}^*\), which can be obtained by any feasible solution to (3), and a lower bound L (we show in Sect. 4 that \(L=\mathrm {v}^Q\) is valid). At each iteration, we fix the candidate value \(y = (U + L)/2\), and solve the nonsmooth problem (15) with a specified finite termination condition (e.g., an iteration limit) to obtain a lower bound \(\underline{\omega }_2(y)\) on \(\omega _2(y)\). The nonsmooth problem (15) could be solved using convex programming techniques, e.g., the subgradient method or its regularized variants such as the bundle method. If \(\underline{\omega }_2(y) > y\), then we can update \(L=y\), otherwise we update \(U = y\). The bisection procedure terminates when the difference between the upper and lower bounds is less than a given tolerance. At any step of the algorithm, L is a valid lower bound on \(\mathrm {v}^{LD}_2\).
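The bisection loop can be sketched as follows, with the inner nonsmooth maximization abstracted into a user-supplied oracle \(\underline{\omega }_2\); the toy oracle in the usage example is a hypothetical stand-in consistent with Proposition 2, not a real subgradient solver:

```python
def bisection_lower_bound(omega_lb, L, U, tol=1e-6):
    """Bisection for a lower bound on v^LD_2 in (17).

    omega_lb(y) must return a lower bound on omega_2(y), e.g. the value
    reached after finitely many subgradient iterations on (15).  L can be
    initialized to the quantile bound v^Q and U to the objective value of
    any feasible solution of (3)."""
    while U - L > tol:
        y = (U + L) / 2
        if omega_lb(y) > y:   # certifies omega_2(y) = +inf, i.e. y < y_bar
            L = y
        else:                 # cannot certify; treat y as achievable
            U = y
    return L  # valid lower bound on v^LD_2 throughout

# Hypothetical oracle consistent with Proposition 2, with y_bar = 0.8:
oracle = lambda y: y if y >= 0.8 else float("inf")
print(round(bisection_lower_bound(oracle, 0.0, 1.0), 4))  # 0.8
```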

4 Strength of Lagrangian dual bounds

In this section, we compare the Lagrangian dual bounds developed in Sect. 3 and the basic lower bounds in Sect. 2.

4.1 Comparing \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^C(M)\)

We first show that \(\mathrm {v}^{LD}_1\) is no smaller than \(\mathrm {v}^C(M)\). Let

$$\begin{aligned} C_M=\Bigl \{(x,z) : x \in S, \ G_i(x) \le M_i(1-z_i),\ z_i\in [0,1],\ \forall i \in \mathcal {N}, \ \sum _{i \in \mathcal {N}} p_iz_i\ge 1-\epsilon \Bigr \} \end{aligned}$$

be the feasible region of the continuous relaxation of (4) in which the variables z are relaxed to be continuous.

Theorem 3

Assume that problem (4) satisfies Slater’s condition, i.e., there exists \((\widehat{x},\widehat{z})\in {{\mathrm{int}}}({{\mathrm{conv}}}(C_M))\), where \({{\mathrm{int}}}(\cdot )\) denotes the interior of a set. Then,

$$\begin{aligned} \mathrm {v}^{LD}_1\ge \mathrm {v}^C(M). \end{aligned}$$

Proof

First observe that the continuous relaxation of (4) is a convex program with a linear objective function over the convex hull of the set \(C_M\), which is assumed to satisfy Slater’s condition [33]. Therefore, by strong duality, the Lagrangian dual of this convex program in which the nonanticipativity constraints \(\sum _{i \in \mathcal {N}} p_iH_ix^i = h\) are relaxed has the optimal value equal to \(\mathrm {v}^C(M)\). But the Lagrangian relaxation problem used in this Lagrangian dual is identical to that in (8) except that the z variables are relaxed to be continuous. The conclusion follows. \(\square \)

Next we establish a set of sufficient conditions under which \(\mathrm {v}^C(M)\) is equal to \(\mathrm {v}^{LD}_1\).

Proposition 3

Suppose that \(S = {\mathbb {R}}_+^n\) and for each \(i\in \mathcal {N}\), \(p_i=\frac{1}{N}\) and \(G_i(x)=\bar{G}_i(x)+M_i\), where \(\bar{G}_i(tx)\le t\bar{G}_i(x)\) for all \(t\ge 1\) and \(\bar{G}_i(0)\le 0\). Then \(\mathrm {v}^C(M)=\mathrm {v}^{LD}_1\).

Proof

We only need to show that \(\mathrm {v}^C(M)\ge \mathrm {v}^{LD}_1\). Recall that

$$\begin{aligned} \mathrm {v}^C(M)&= \min _{x,z} \left\{ c^\top x:x\in S, G_i(x) \le M_i(1-z_i),z_i\in [0,1],\forall i\in \mathcal {N},\right. \\&\left. \quad \qquad \quad \sum _{i\in \mathcal {N}}z_i\ge \lceil N(1-\epsilon )\rceil \right\} \\&= \min _{x,z}\left\{ c^\top x:x\in S, G_i(x) \le M_i(1-z_i),\forall i\in \mathcal {N},z\in {{\mathrm{conv}}}(Z) \right\} . \end{aligned}$$

From Theorem 1 and Eq. (8) we know that

$$\begin{aligned} \mathrm {v}^{LD}_1&= \min _{x,z}\bigg \{ \sum _{i=1}^N p_ic^\top x^i: \{(x^i,z_i)\}_{i \in \mathcal {N}} \in {{\mathrm{conv}}}\left( \left\{ \{(x^i,z_i)\}_{i \in \mathcal {N}}:x^i\in S, \right. \right. \\&\quad \qquad \quad \left. \left. G_i(x^i) \le M_i(1-z_i),\forall i\in \mathcal {N},z\in Z\right\} \right) , \sum _{i \in \mathcal {N}} p_iH_ix^i = h\bigg \}. \end{aligned}$$

Let \((\widehat{\mathbf {x}},\widehat{\mathbf {z}})\), where \(\widehat{\mathbf {z}}:= \{\widehat{z}_i\}_{i \in \mathcal {N}}\), be an optimal solution of the continuous relaxation of (4). Since \(\widehat{\mathbf {z}} \in {{\mathrm{conv}}}(Z)\), there exists a set of points \(\{\mathbf {z}_k\} \subseteq Z\) such that \(\widehat{\mathbf {z}} = \sum _k \lambda _k \mathbf {z}_k\) with \(\sum _k \lambda _k = 1\) and \(\lambda _k > 0\). Construct \(\mathbf {x}^i_k = \frac{\widehat{\mathbf {x}}}{\widehat{z}_i}z_{ik}\) for all k and for all \(i\in \mathcal {N}\). Note that the operations are well defined since, for each i, \(\widehat{z}_i = 1\) (or 0) implies \(z_{ik} = 1\) (or 0) for all k, and we use the convention \(0\cdot \infty =0\). It follows that

$$\begin{aligned}&\displaystyle \mathbf {x}^i_k\in S,\mathbf {z}_k\in Z,\\&\displaystyle G_i(\mathbf {x}^i_k)=\bar{G}_i(\mathbf {x}^i_k)+M_i\le \frac{1}{\widehat{z}_i}\bar{G}_i(\widehat{\mathbf {x}})z_{ik}+M_i\le M_i(1-z_{ik}),\forall i\in \mathcal {N}, \end{aligned}$$

where the equality is the definition of \(G_i(\cdot )\); the first inequality follows because if \(z_{ik}=0\), then \(\mathbf {x}^i_k=0\) and \(\bar{G}_i(0)\le 0\), while otherwise \(\mathbf {x}^i_k=\frac{\widehat{\mathbf {x}}}{\widehat{z}_i}\) and \(\bar{G}_i(t x)\le t \bar{G}_i(x)\) for all \(t \ge 1\); and the second inequality follows since \( G_i(\widehat{\mathbf {x}}) \le M_i(1-\widehat{z}_i)\), or equivalently, \(\frac{1}{\widehat{z}_i}\bar{G}_i(\widehat{\mathbf {x}})\le -M_i\). Now define \((\mathbf {x}^i,\widehat{\mathbf {z}}) = \sum _k \lambda _k (\mathbf {x}_k^i,\mathbf {z}_k)\); then we have \(\mathbf {x}^i=\widehat{\mathbf {x}}\) for all \(i\in \mathcal {N}\). Hence,

$$\begin{aligned}&\{(\mathbf {x}^i,\widehat{z}_i)\}_{i \in \mathcal {N}} \in {{\mathrm{conv}}}\left( \left\{ \{(x^i,z_i)\}_{i \in \mathcal {N}}:x^i\in S,\ G_i(x^i) \le M_i(1-z_i),\ \forall i\in \mathcal {N},\ z\in Z\right\} \right) \end{aligned}$$

and \(\{\mathbf {x}^i\}_{i \in \mathcal {N}} \) also satisfies the nonanticipativity constraints. Thus \((\widehat{\mathbf {x}},\widehat{\mathbf {z}})\) is also feasible to (8) implying \(\mathrm {v}^C(M)\ge \mathrm {v}^{LD}_1\). \(\square \)

A large class of problems that satisfy the conditions of Proposition 3 are chance-constrained covering linear programs with equiprobable scenarios [30]:

$$\begin{aligned} \min _{x,z}\left\{ c^\top x : A^i x \ge b^i z_i, \ \forall i \in \mathcal {N}, \ \sum _{i \in \mathcal {N}} z_i \ge (1-\epsilon )N, \ x \ge 0, \ z_i \in \{0,1\}, \ \forall i \in \mathcal {N}\right\} , \end{aligned}$$

where \(A^i\in {\mathbb {R}}_+^{m_i\times n}\) and \(b^i\in {\mathbb {R}}_+^{m_i}\) for all \(i\in \mathcal {N}\). Recasting the above problem in the form of (4), we note that \(S={\mathbb {R}}_+^n\), \(\bar{G}_i(x)=-A^ix\) and \(M_i=b^i\). Indeed, we have \(\bar{G}_i(t x)=-t A^ix=t\bar{G}_i(x)\) for all \(t\ge 1\), and \(\bar{G}_i(0) = 0\le 0\).

Remark 1

For chance-constrained covering linear programs with equiprobable scenarios [30], the basic nonanticipative dual bound \(\mathrm {v}^{LD}_1\) is equal to the relaxation bound of the big-M formulation (4) obtained by relaxing the integrality constraints on z, i.e., \(\mathrm {v}^{LD}_1= \mathrm {v}^C(M)\). For other chance-constrained linear programs, \(\mathrm {v}^{LD}_1\) could be strictly better than \(\mathrm {v}^C(M)\) (see Table 1 in Sect. 7).

4.2 Comparing \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^{LD}_2\)

The following theorem compares the strengths of the two different Lagrangian dual bounds \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^{LD}_2\).

Theorem 4

The quantile-based Lagrangian dual is at least as strong as the basic nonanticipative Lagrangian dual, i.e., \(\mathrm {v}^{LD}_1\le \mathrm {v}^{LD}_2\).

Proof

Let \(Q_i := \{(x^i,z_i): \mathrm{(7c)}, \mathrm{(7e)}\} (=\{(x^i,z_i): \mathrm{(13d)}, \mathrm{(13f)}\})\) for all \(i \in \mathcal {N}\). The claim follows since

$$\begin{aligned} \mathrm {v}^{LD}_1&=\displaystyle {\max _\lambda \min _{x,y,z}} \left\{ y + \sum _{i \in \mathcal {N}} p_i\lambda ^{\top }H_ix^i - \lambda ^\top h: \ (x^i,z_i) \in Q_i, \ \forall i \in \mathcal {N},\right. \\&\qquad \qquad \qquad \quad \left. z \in Z, y \ge \sum _{i \in \mathcal {N}} p_i c^\top x^i \right\} \\&\le \displaystyle {\max _\lambda \min _{x,y,z}} \left\{ y + \sum _{i \in \mathcal {N}} p_i\lambda ^{\top }H_ix^i - \lambda ^\top h: \ (x^i,z_i) \in Q_i, \ \forall i \in \mathcal {N},\right. \\&\qquad \qquad \qquad \quad \left. z \in Z, y \ge c^\top x^i, \ \forall i \in \mathcal {N}\right\} \\&\le \displaystyle {\min _y \max _\lambda \min _{x,z}} \left\{ y + \sum _{i \in \mathcal {N}} p_i\lambda ^{\top }H_ix^i - \lambda ^\top h: \ (x^i,z_i) \in Q_i, \ \forall i \in \mathcal {N}, \right. \\&\qquad \qquad \qquad \quad \qquad \left. z \in Z, y \ge c^\top x^i, \ \forall i \in \mathcal {N}\right\} = \mathrm {v}^{LD}_2, \end{aligned}$$

where the first equality follows from the definition of \(\mathrm {v}^{LD}_1\); the second inequality follows since \( y \ge \sum _{i \in \mathcal {N}} p_i c^\top x^i\) is an aggregation of the constraints \(y \ge c^\top x^i\) for each \(i \in \mathcal {N}\); the third inequality follows from the max-min inequality; and the final equality is from the definition of \(\mathrm {v}^{LD}_2\). \(\square \)

4.3 Comparisons with \(\mathrm {v}^Q\)

Theorem 5

The quantile-based Lagrangian dual bound is at least as strong as the quantile bound, i.e., \(\mathrm {v}^{LD}_2\ge \mathrm {v}^Q\).

Proof

First define

$$\begin{aligned} \mathrm {v}^R_2(y) = {\left\{ \begin{array}{ll} y &{} R(y) \ne \emptyset , \\ +\infty &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Observe that \(\mathrm {v}^R_2(y) = y + \mathcal {L}_2(0,y)\), and so \(\mathrm {v}^R_2(y) \le \omega _2(y)\) for all \(y \in {\mathbb {R}}\) because it is obtained by using \(\lambda = 0\) in (15). Thus, it follows that \(\mathrm {v}^R := \min \{ \mathrm {v}^R_2(y) : y \in {\mathbb {R}}\} \le \mathrm {v}^{LD}_2\). We show that this bound is identical to the quantile bound, i.e., \(\mathrm {v}^R = \mathrm {v}^Q\).

Recall that

$$\begin{aligned} R(y) = \left\{ \{x^i,z_i\}_{ i \in \mathcal {N}} : \mathrm{(13b)}, \mathrm{(13d)}{-}\mathrm{(13f)} \right\} . \end{aligned}$$

We first show \(R(\mathrm {v}^Q) \ne \emptyset \), which implies \(\mathrm {v}_2^R(\mathrm {v}^Q) = \mathrm {v}^Q\) and thus \(\mathrm {v}^R \le \mathrm {v}^Q\). Recall that \(\eta _i := \min \{c^\top x: x\in X^i\}\). Let \(I^Q = \{ \sigma _q,\ldots ,\sigma _N\}\); by the sorting of the \(\eta \) values, \(\eta _{i} \le \eta _{\sigma _q} =\mathrm {v}^Q\) for all \(i \in I^Q\). By the definition of \(\mathrm {v}^Q\), \(\sum _{i \in I^Q} p_i \ge 1 - \epsilon \). Also, for each \(i \in I^Q\), there exists \(\bar{x}^i \in X^i\) with \(c^\top \bar{x}^i = \eta _i \le \mathrm {v}^Q\). Next, let \(\widehat{x} \in \arg \min \{ c^\top x : x \in S \}\) and observe that \(c^\top \widehat{x} \le \eta _i \le \mathrm {v}^Q\) for all \(i\in I^Q\). Then, a feasible point of \(R(\mathrm {v}^Q)\) is obtained by setting \(x^i = \bar{x}^i\) for \(i \in I^Q\), \(x^i = \widehat{x}\) for \(i \in \mathcal {N}\!{\setminus }\!I^Q\), and setting \(z_i = 1\) for \(i \in I^Q\) and \(z_i = 0\) otherwise.

Now let \(y < \mathrm {v}^Q\) and let \(I(y) := \{ i \in \mathcal {N}: \eta _i \le y \}\). For each scenario \(i \in \mathcal {N}\!{\setminus }\!I(y)\) there is no \(x^i \in X^i\) with \(c^\top x^i \le y\). By the definition of \(\mathrm {v}^Q\), it holds that \(\sum _{i \in I(y) } p_i< 1 - \epsilon \), so \(R(y) = \emptyset \) and \(\mathrm {v}^R_2(y) = +\infty \), implying \(\mathrm {v}^R > y\). As \(y < \mathrm {v}^Q\) is arbitrary, we conclude that \(\mathrm {v}^R \ge \mathrm {v}^Q\). \(\square \)

Neither \(\mathrm {v}^{LD}_1\) nor \(\mathrm {v}^C(M)\) has a general bound relationship with \(\mathrm {v}^Q\). The computational results in Sect. 7 provide examples where the quantile bound \(\mathrm {v}^Q\) is stronger than \(\mathrm {v}^{LD}_1\) or \(\mathrm {v}^C(M)\), while the following example shows that \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^C(M)\) can be stronger than \(\mathrm {v}^Q\).

Example 1

Consider a three-scenario instance as follows: \(X^1 = \{x\in \mathbb {R}^2_+ : 0.5x_1+2x_2 \ge 1\}\), \(X^2 = \{x\in \mathbb {R}^2_+ : 2x_1+0.5x_2 \ge 1\}\), \(X^3 = \{x\in \mathbb {R}^2_+ : x_1+x_2\ge 1\}\), and \(S = {\mathbb {R}}_+^2\). Each scenario occurs with probability 1/3, and we let \(\epsilon = 1/3\) and \(M=1\). The objective is to minimize \(x_1 +x_2\). For this instance, the quantile bound is \(\mathrm {v}^Q= 0.5\), while \(\mathrm {v}^{LD}_1= \mathrm {v}^C(M)= 4/7\); therefore, \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^C(M)\) are the stronger lower bounds here.
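The value \(\mathrm {v}^C(M)= 4/7\) can be checked numerically by solving the continuous relaxation of the big-M formulation (4) for this instance; a sketch using SciPy's LP solver:

```python
import numpy as np
from scipy.optimize import linprog

# Continuous relaxation of the big-M formulation (4) for Example 1.
# Variables (x1, x2, z1, z2, z3); objective x1 + x2; M = 1, eps = 1/3.
c = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
A_ub = np.array([
    [-0.5, -2.0, 1.0, 0.0, 0.0],   # 0.5 x1 + 2 x2 >= 1 - M(1 - z1)
    [-2.0, -0.5, 0.0, 1.0, 0.0],   # 2 x1 + 0.5 x2 >= 1 - M(1 - z2)
    [-1.0, -1.0, 0.0, 0.0, 1.0],   # x1 + x2 >= 1 - M(1 - z3)
    [0.0, 0.0, -1.0, -1.0, -1.0],  # (z1 + z2 + z3)/3 >= 1 - eps = 2/3
])
b_ub = np.array([0.0, 0.0, 0.0, -2.0])
bounds = [(0, None), (0, None), (0, 1), (0, 1), (0, 1)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.fun)  # 4/7 ≈ 0.5714, strictly above v^Q = 0.5
```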

4.4 Bound comparison summary

We close this section by noting a set of sufficient conditions under which there is no duality gap.

Proposition 4

Suppose \(G_i: S \rightarrow {\mathbb {R}}_{-}^{m_i}\cup \mathcal {R}_M^i\) for all \(i\in \mathcal {N}\), where \(\mathcal {R}_M^i=\{s\in {\mathbb {R}}^{m_i}:\Vert s\Vert _{\infty }=M\}\) for all \(i\in \mathcal {N}\) and \(M\in {\mathbb {R}}_+\). Then we have \(\mathrm {v}^C(M)=\mathrm {v}^{LD}_1=\mathrm {v}^{LD}_2=\mathrm {v}^*\).

Proof

From Theorem  and Theorems 3 and 4, it is sufficient to show that \(\mathrm {v}^C(M)\ge \mathrm {v}^*\).

Suppose that \((\widehat{\mathbf {x}},\widehat{\mathbf {z}})\) is an optimal solution of the continuous relaxation of (4), where \(\widehat{\mathbf {z}}:= \{\widehat{z}_i\}_{i \in \mathcal {N}}\). We show that \((\widehat{\mathbf {x}},\lceil \widehat{\mathbf {z}}\rceil )\) is another optimal solution. If \(\widehat{\mathbf {z}}\) is integral, we are done. Otherwise, for any \(i'\) with \(\widehat{z}_{i'}\in (0,1)\), the definition of \(G_{i'}(\cdot )\) forces \(G_{i'}(\widehat{\mathbf {x}})\le 0\). Thus, \((\widehat{\mathbf {x}},\lceil \widehat{\mathbf {z}}\rceil )\) is feasible to the continuous relaxation of (4) and attains the optimal value \(\mathrm {v}^C(M)\). Since \(\lceil \widehat{\mathbf {z}}\rceil \in Z\), \((\widehat{\mathbf {x}},\lceil \widehat{\mathbf {z}}\rceil )\) is also feasible to (4). Thus, \(\mathrm {v}^C(M)\ge \mathrm {v}^*\). \(\square \)

A large class of problems satisfying the conditions of Proposition 4 is the class of chance-constrained set covering problems [2, 4]:

$$\begin{aligned} \min _{x,z} \Bigl \{ c^\top x: A^i x \ge b^i z_i, \ \forall \ i \in \mathcal {N}, x \in \{0,1\}^n, z \in Z \Bigr \}, \end{aligned}$$
(18)

where \(A^i\in \{0,1\}^{m_i\times n}\), \(b^i\in \{0,1\}^{m_i}\) and \(\Vert b^i\Vert _{\infty }=1\) for all \(i\in \mathcal {N}\). Here \(G_i(x) = -A^ix+b^i\), so that \(G_i:\{0,1\}^n\rightarrow {\mathbb {Z}}_{-}^{m_i}\cup \mathcal {R}_1^i\) with \(\mathcal {R}_1^i=\{s\in \{0,1\}^{m_i}:\Vert s\Vert _{\infty }=1 \}\) for all \(i\in \mathcal {N}\), and hence \(\mathrm {v}^C(M)=\mathrm {v}^{LD}_1=\mathrm {v}^{LD}_2=\mathrm {v}^*\).

The relationships between the basic lower bounds of Sect. 2 and the Lagrangian dual bounds of Sect. 3 are summarized in Fig. 1.

Fig. 1 Bound comparison summary

5 Primal formulations for chance-constrained mixed-integer linear programs

In this section we consider chance-constrained mixed-integer linear programs (MILP), i.e., problem (3) where \(S = \{x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r : Dx \le d \}\) and \(X^i = \{ x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r: \ A^i x \le b^i\}\) for each \(i \in \mathcal {N}\). Recall our assumption that, for all \(i \in \mathcal {N}\), \(X^i \subseteq S\) and so we may assume the constraints \(Dx \le d\) are included in the constraints \(A^ix \le b^i\). We derive two new formulations for such problems that are inspired by the two Lagrangian dual problems proposed in the previous sections. In particular, under certain conditions, these relaxations are primal formulations of the Lagrangian duals. The constructions here can be extended to the case where S, \(X^i\) and f are defined by convex functions using the perspective function approach in [9].

5.1 Primal formulation corresponding to \(\mathrm {v}^{LD}_1\)

Note that replacing \(z_i = \mathbb {I}(x^i\in X^i)\) in (8) by \(z_i \le \mathbb {I}(x^i\in X^i)\) for each \(i\in \mathcal {N}\) yields an equivalent formulation. Recall that \(Z= \{ z \in \{0,1\}^N : \sum _{i=1}^N p_iz_i \ge 1-\epsilon \}\) and let

$$\begin{aligned} T_1&:= \Bigl \{ \{(x^i,z_i)\}_{i \in \mathcal {N}}: Dx^i \le d, z_i \le \mathbb {I}(A^i x^i \le b^i), \\&\qquad x^i \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r,\ \forall \ i \in \mathcal {N},\ z \in Z \Bigr \}. \end{aligned}$$

From Theorem 1 and Eq. (8) we know that

$$\begin{aligned} \mathrm {v}^{LD}_1= \min _{x,z}\left\{ \sum _{i=1}^N p_ic^\top x^i: \{(x^i,z_i)\}_{i \in \mathcal {N}} \in {{\mathrm{conv}}}(T_1), \sum _{i \in \mathcal {N}} p_iH_ix^i = h \right\} . \end{aligned}$$
(19)

Next we use an extended formulation of \({{\mathrm{conv}}}(T_1)\) to derive a linear programming relaxation of (19) in the following form

$$\begin{aligned} \mathrm {z}^{LP}_1= \min _{\mathbf {x},\mathbf {z},\mathbf {u},\mathbf {w}} \&\sum _{i=1}^N p_ic^\top x^i \qquad&\mathrm{(20a)}\\ \text {s.t. } \ &\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&\\ &u^i + w^i = x^i, \ \forall \ i \in \mathcal {N},&\mathrm{(20b)}\\ &A^i u^i \le b^i z_i, \ \forall \ i \in \mathcal {N},&\mathrm{(20c)}\\ &D w^i \le d(1-z_i), \ \forall \ i \in \mathcal {N},&\mathrm{(20d)}\\ &z \in {{\mathrm{conv}}}(Z).&\mathrm{(20e)} \end{aligned}$$
(20)

We let \(P_S:= \{x \in {\mathbb {R}}^n : Dx \le d \}\) and \(P^i := \{ x \in {\mathbb {R}}^{n} : \ A^i x \le b^i \}, i \in \mathcal {N}\) be the continuous relaxations of \(S,X^i\) for each \(i\in \mathcal {N}\) respectively (the sets are identical in the case \(r=0\)). The next theorem shows the relationship between \(\mathrm {v}^{LD}_1\) and \(\mathrm {z}^{LP}_1\).

Theorem 6

The basic Lagrangian dual bound \(v^{LD}_1\) and the primal bound \(z^{LP}_1\) defined in formulation (20) satisfy: \(v^{LD}_1 \ge z^{LP}_1\), where the equality holds if \(P_S= {{\mathrm{conv}}}(S)\) and \(P^i = {{\mathrm{conv}}}(X^i)\) for all \(i \in \mathcal {N}\).

Proof

We just need to show that when \(P_S= {{\mathrm{conv}}}(S)\) and \(P^i = {{\mathrm{conv}}}(X^i)\) for all \(i \in \mathcal {N}\), \(\mathrm {v}^{LD}_1=\mathrm {z}^{LP}_1\). In the following, for the sake of notational simplicity, we use \((\mathbf {x},\mathbf {z}, \mathbf {u}, \mathbf {w}) = \{(x^i, z_i, u^i, w^i)\}_{i \in \mathcal {N}}\) and the operations on these vectors will be assumed to be scenario-wise, e.g., \(\mathbf {x}\cdot \mathbf {z}:= \{ x^iz_i \}_{i \in \mathcal {N}}\) and \(\mathbf {x}/ \mathbf {z}:= \{ x^i / z_i \}_{i \in \mathcal {N}}\) (here, if \(z_i = 0\) the corresponding element is defined to be zero).

  1. (i)

    Define

    $$\begin{aligned} \bar{T}_1 := \left\{ \{(x^i,z_i)\}_{i \in \mathcal {N}}: Dx^i \le d, z_i \le \mathbb {I}(A^i x^i \le b^i),\ \forall \ i \in \mathcal {N},\ z \in Z \right\} . \end{aligned}$$

We show that \({{\mathrm{conv}}}(T_1)={{\mathrm{conv}}}(\bar{T}_1)\). Clearly, \({{\mathrm{conv}}}(T_1)\subseteq {{\mathrm{conv}}}(\bar{T}_1)\) as \(T_1\subseteq \bar{T}_1\). Hence, we only need to show that \({{\mathrm{conv}}}(T_1)\supseteq {{\mathrm{conv}}}(\bar{T}_1)\) or, equivalently, \({{\mathrm{conv}}}(T_1)\supseteq \bar{T}_1\). Let \((\mathbf {x},\mathbf {z})\in \bar{T}_1\). Then for each \(i\in \mathcal {N}\), we have \(x^i\in P^i\) if \(z_i=1\), and \(x^i\in P_S\) otherwise. Thus, for each \(i\in \mathcal {N}\) there exists a finite set of vectors \(\{x^i_{k}\}_{k \in K^i}\) and nonnegative weights \(\{\lambda ^i_k\}_{k \in K^i}\) such that \(\sum _{k \in K^i} \lambda ^i_k=1\), \(x^i= \sum _{k \in K^i} \lambda ^i_{k} x^i_{k}\), and \(Dx^i_{k}\le d\), \(z_i \le \mathbb {I}(A^i x^i_{k} \le b^i)\), \(x^i_{k} \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r\) for each \(k\in K^i\). Hence, for each selection \(\{k^i\}_{i\in \mathcal {N}}\in \prod _{i\in \mathcal {N}}K^i\), we have \(\{(x^{i}_{k^i},z_i)\}_{i \in \mathcal {N}}\in T_1\), and therefore \((\mathbf {x},\mathbf {z}) \in {{\mathrm{conv}}}\bigl (\bigl \{\{(x^i_{k^i},z_i)\}_{i \in \mathcal {N}}: \{k^i\}_{i\in \mathcal {N}}\in \prod _{i\in \mathcal {N}}K^i\bigr \}\bigr )\subseteq {{\mathrm{conv}}}(T_1)\).

  2. (ii)

    Next, we define the polyhedron:

$$\begin{aligned} W_1 := \bigl \{ \{( x^i, z_i)\}_{i \in \mathcal {N}} : \exists (u^i,w^i), i \in \mathcal {N}, \text { s.t. } \mathrm{(20c)}{-}\mathrm{(20e)},\ u^i + w^i = x^i, \ \forall i \in \mathcal {N}\bigr \}. \end{aligned}$$

    Then, because the constraints \(\sum _{i \in \mathcal {N}} p_iH_ix^i = h\) enforce that all vectors \(x^i\) are equal to the same vector, say x, (20) can be reformulated as:

    $$\begin{aligned} \mathrm {z}^{LP}_1= \min _{x,z} \left\{ \sum _{i=1}^N p_ic^\top x^i: \{(x^i, z_i)\}_{i \in \mathcal {N}} \in W_1, \sum _{i \in \mathcal {N}} p_iH_ix^i = h \right\} . \end{aligned}$$

Therefore, it is sufficient to show that \({{\mathrm{conv}}}(T_1)={{\mathrm{conv}}}(\bar{T}_1) = W_1\). We claim that \(\bar{T}_1 \subseteq W_1\), since for each \((x,z)\in \bar{T}_1\), we can find a \((u^i,w^i)\) pair for each \(i\in \mathcal {N}\) that satisfies all the constraints in \(W_1\) by setting \(u^i = x^i, w^i = 0\) if \(z_i = 1\), and \(u^i = 0, w^i = x^i\) otherwise. Therefore, \({{\mathrm{conv}}}(T_1)={{\mathrm{conv}}}(\bar{T}_1) \subseteq W_1\), and it only remains to show \(W_1 \subseteq {{\mathrm{conv}}}(\bar{T}_1)\). Let \((\mathbf {x},\mathbf {z}, \mathbf {u}, \mathbf {w}) \in W_1\). As \(\mathbf {z}\in {{\mathrm{conv}}}(Z)\), there exists a finite set of vectors \(\{\bar{\mathbf {z}}_k\}_{k \in K}\subseteq Z\) and nonnegative weights \(\{\lambda _k\}_{k \in K}\) summing to one such that \(\mathbf {z}= \sum _{k \in K} \lambda _k \bar{\mathbf {z}}_k\). Now, for each \(k \in K\), define the vector \(\bar{\mathbf {x}}_k = \bar{\mathbf {z}}_k \cdot ( \mathbf {u}/ \mathbf {z}) + (\mathbf {1} - \bar{\mathbf {z}}_k) \cdot \mathbf {w}/ (\mathbf {1}-\mathbf {z})\). Then a simple calculation shows that \(\sum _{k \in K} \lambda _k \bar{\mathbf {x}}_k=\mathbf {x}\). The vector \((\bar{\mathbf {x}}_k,\bar{\mathbf {z}}_k)\) satisfies \(\bar{\mathbf {z}}_k \in Z\), and for \(i \in \mathcal {N}\), if \(\bar{\mathbf {z}}_{ik} = 0\) then \(\bar{\mathbf {x}}^{i}_k=w^i / (1-z_i) \in {{\mathrm{conv}}}(S)=P_S\) from (20d), and if \(\bar{\mathbf {z}}_{ik} = 1\), then \(\bar{\mathbf {x}}^{i}_k =u^i / z_i\in {{\mathrm{conv}}}(X^i)=P^i\) from (20c). Thus, \((\bar{\mathbf {x}}_k,\bar{\mathbf {z}}_k) \in \bar{T}_1\) for each \(k\in K\), which directly implies that \((\mathbf {x},\mathbf {z}) \in {{\mathrm{conv}}}(\bar{T}_1)\). \(\square \)

It follows from Theorem 6 that \(\mathrm {v}^{LD}_1= \mathrm {z}^{LP}_1\) for chance-constrained linear programs (i.e., when \(r=0\)).

When \(p_i = 1/N\) for all \(i\in \mathcal {N}\), we have \({{\mathrm{conv}}}(Z) = \{z: \sum _{i\in \mathcal {N}} z_i \ge \lceil (1-\epsilon )N\rceil , z_i\in [0,1], i\in \mathcal {N}\}\). For general \(p_i\) values, a description of \({{\mathrm{conv}}}(Z)\) in (20e) would require the convex hull of the corresponding knapsack set. Since this is in general intractable, we may replace constraint (20e) with a suitable polyhedral relaxation, at the expense of weakening the LP relaxation bound.

Inspired by the above primal formulation, we obtain the following “big-M” free formulation for chance-constrained MILP.

Proposition 5

Consider a CCSP (1) with assumptions (i)–(iv), then

$$\begin{aligned} \mathrm {v}^*=\min _{x,z,u,w} \ \{ c^\top x : \mathrm{(20b)}{-}\mathrm{(20d)}, z\in Z, x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r \}, \end{aligned}$$
(21)

is a valid MILP formulation of (3).

5.2 Primal formulation corresponding to \(\mathrm {v}^{LD}_2\)

We next derive a primal formulation for \(\mathrm {v}^{LD}_2\) under certain conditions. From Theorem 1 and Eq. (17), we have

$$\begin{aligned} \mathrm {v}^{LD}_2= \min _{x,y,z}\left\{ y: \exists \{(x^i,z_i)\}_{i \in \mathcal {N}} \in {{\mathrm{conv}}}(R(y)), \sum _{i \in \mathcal {N}} p_iH_ix^i = h \right\} , \end{aligned}$$
(22)

where \(R(y) = \{ \{x^i,z_i\}_{i \in \mathcal {N}} : c^\top x^i\le y, Dx^i \le d, z_i \le \mathbb {I}(A^i x^i \le b^i), x^i \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r, \ \forall \ i \in \mathcal {N}, \ z \in Z \}\). Next we use an extended formulation of \({{\mathrm{conv}}}(R(y))\) to derive the following nonlinear programming formulation of (22):

$$\begin{aligned} \mathrm {z}^{NLP}_2= \min _{\mathbf {x},y,\mathbf {z},\mathbf {u},\mathbf {w}} \ &y \qquad&\mathrm{(23a)}\\ \text {s.t. } \ &\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&\\ &c^\top u^i \le y z_i, \ \forall \ i \in \mathcal {N},&\mathrm{(23b)}\\ &c^\top w^i \le y (1-z_i), \ \forall \ i \in \mathcal {N},&\mathrm{(23c)}\\ &\mathrm{(20b)}{-}\mathrm{(20e)}. \end{aligned}$$
(23)

We define \(P_S(y) := \{x \in {\mathbb {R}}^n : c^\top x\le y,Dx \le d \}\) and \(P^i(y) := \{ x \in {\mathbb {R}}^{n} : c^\top x\le y,\ A^i x \le b^i \}, i \in \mathcal {N}\). The next theorem shows the relationship between \(\mathrm {v}^{LD}_2\) and \(\mathrm {z}^{NLP}_2\).

Theorem 7

The quantile-based Lagrangian dual bound \(\mathrm {v}^{LD}_2\) and the primal bound \(\mathrm {z}^{NLP}_2\) defined in formulation (23) satisfy: \(\mathrm {v}^{LD}_2\ge \mathrm {z}^{NLP}_2\), where the equality holds if \(P_S(y) = {{\mathrm{conv}}}(S\cap \{x:c^\top x\le y\})\) and \(P^i(y) = {{\mathrm{conv}}}(X^i\cap \{x:c^\top x\le y\})\) for all \(i \in \mathcal {N}\) and for all \(y \in {\mathbb {R}}\).

Proof

We only need to show that when \(P_S(y) = {{\mathrm{conv}}}(S\cap \{x:c^\top x\le y\})\) and \(P^i(y) = {{\mathrm{conv}}}(X^i\cap \{x:c^\top x\le y\})\) for all \(i \in \mathcal {N}\) and y, then \(\mathrm {v}^{LD}_2=\mathrm {z}^{NLP}_2\). This directly implies that \(\mathrm {v}^{LD}_2\ge \mathrm {z}^{NLP}_2\) as \(P_S(y) \supseteq {{\mathrm{conv}}}(S\cap \{x:c^\top x\le y\})\) and \(P^i(y) \supseteq {{\mathrm{conv}}}(X^i\cap \{x:c^\top x\le y\})\).

The remainder of the proof is almost identical to that of Theorem 6, so we provide a sketch.

  1. (i)

    Let us define \(\bar{R}(y):=\{ \{(x^i,z_i)\}_{i \in \mathcal {N}}: c^\top x^i\le y, Dx^i \le d, z_i \le \mathbb {I}(A^i x^i \le b^i), \ \forall \ i \in \mathcal {N}, \ z \in Z \}\). Using an approach identical to that in part (i) of the proof of Theorem 6 it can be shown that \({{\mathrm{conv}}}(R(y))={{\mathrm{conv}}}(\bar{R}(y))\) for a given y.

  2. (ii)

    Next, we define a set \(W_2(y) = \{\{( x^i, z_i)\}_{i \in \mathcal {N}}: \exists (u^i,w^i), i \in \mathcal {N}, \text { s.t. } \mathrm{(20c)}{-}\mathrm{(20e)}, \mathrm{(23b)}, \mathrm{(23c)}, \ u^i + w^i = x^i, \ \forall i \in \mathcal {N}\}\) with a given y. Then, because the constraints \(\sum _{i \in \mathcal {N}} p_iH_ix^i = h\) enforce that all vectors \(x^i\) are equal to the same vector, say x, (23) can be reformulated as:

    $$\begin{aligned} \mathrm {z}^{NLP}_2= \min _{x,y,z} \left\{ y: \{(x^i, z_i)\}_{i \in \mathcal {N}} \in W_2(y), \sum _{i \in \mathcal {N}} p_iH_ix^i = h \right\} . \end{aligned}$$

    Therefore, it is sufficient to show that \({{\mathrm{conv}}}(R(y))={{\mathrm{conv}}}(\bar{R}(y)) = W_2(y)\). The proof of this is similar to part (ii) of the proof of Theorem 6. \(\square \)

It follows from Theorem 7 that \(\mathrm {v}^{LD}_2= \mathrm {z}^{NLP}_2\) for chance-constrained linear programs (i.e., when \(r=0\)).

Although (23) is a nonconvex nonlinear program, it can be solved by bisection on the value of y, by observing that the feasible region of (23) is nonincreasing in y. Thus, \(\mathrm {z}^{NLP}_2\) can be calculated by finding the minimum value of y for which the feasible region of (23) is nonempty. Such a minimum exists, since the feasible region of (23) is a closed set and its objective is bounded from below. For any fixed y, the feasibility problem of (23) is a linear program. The disadvantage of solving (23) by bisection is that it may be difficult to incorporate such a procedure within a standard linear programming based branch-and-cut algorithm. We therefore propose an iterative scheme that solves a sequence of linear programs generating progressively better lower bounds for \(\mathrm {z}^{NLP}_2\), and eventually converges to \(\mathrm {z}^{NLP}_2\).
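The bisection step can be sketched generically: the LP feasibility check of (23) at a fixed y plays the role of the oracle below. The oracle interface and names are our own illustration, assuming only the monotonicity noted above:

```python
def min_feasible_y(is_feasible, lo, hi, tol=1e-6):
    """Return (approximately) the smallest y in [lo, hi] with is_feasible(y)
    True, assuming feasibility is monotone: once feasible, always feasible."""
    assert is_feasible(hi) and not is_feasible(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_feasible(mid):
            hi = mid     # keep a feasible upper endpoint
        else:
            lo = mid
    return hi

# Toy oracle standing in for the LP feasibility problem of (23).
print(min_feasible_y(lambda y: y >= 0.7, 0.0, 1.0))
```

Each oracle call is one LP solve, so the number of LPs grows only logarithmically in the initial bracket width divided by the tolerance.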

5.2.1 A linear programming based approach for \(\mathrm {z}^{NLP}_2\)

Let \(\ell \) be a lower bound for \(\mathrm {z}^{NLP}_2\) (e.g., one can use \(\underline{\mathrm {v}}^C(M)\)). Given such a lower bound \(\ell \), the nonconvex constraints (23b) and (23c) can be reformulated with linear constraints, leading to the following formulation:

$$\begin{aligned} \mathrm {z}^{LP}_2(\ell ) = \min _{\mathbf {x},y,\mathbf {z},\mathbf {u},\mathbf {w}} \ &y \qquad&\mathrm{(24a)}\\ \text {s.t. } \ &\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&\\ &y \ge c^\top u^i + \ell (1-z_i), \ \forall \ i \in \mathcal {N},&\mathrm{(24b)}\\ &y \ge c^\top w^i + \ell z_i, \ \forall \ i \in \mathcal {N},&\mathrm{(24c)}\\ &\mathrm{(20b)}{-}\mathrm{(20e)}. \end{aligned}$$
(24)

Observe that \(\mathrm {z}^{LP}_2(\ell )\) is an increasing function of \(\ell \), and if \(\ell \le \mathrm {z}^{NLP}_2\) then \(\mathrm {z}^{LP}_2(\ell ) \le \mathrm {z}^{NLP}_2\). Therefore, if we solve (24) iteratively and update \(\ell \) using the optimal objective value, eventually we will converge to some \(\mathrm {z}^{LP}_2(\bar{\ell }) = \bar{\ell } \le \mathrm {z}^{NLP}_2\). In fact, the value \(\bar{\ell }\) will be the same as \(\mathrm {z}^{NLP}_2\), since if we replace y and \(\ell \) in (24b) and (24c) by \(\bar{\ell }\), we get the same structure as (23b) and (23c). We formalize these assertions in the next two results.
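The iteration can be sketched with an abstract solver for (24); `solve_lp` below is a stand-in for computing \(\mathrm {z}^{LP}_2(\ell )\), replaced here by a toy increasing function with a known fixed point, purely for illustration:

```python
def iterate_lower_bounds(solve_lp, ell0, tol=1e-9, max_iter=10000):
    """Repeatedly set ell <- solve_lp(ell); stop once the bound no longer
    improves by more than tol, mirroring the scheme of Propositions 6-7."""
    ell = ell0
    for _ in range(max_iter):
        nxt = solve_lp(ell)
        if nxt - ell <= tol:
            return nxt
        ell = nxt
    return ell

# Toy stand-in for the LP value function: increasing in ell, fixed point 2.0.
print(iterate_lower_bounds(lambda ell: 0.5 * ell + 1.0, 0.0))
```

Starting below the fixed point, the iterates increase monotonically toward it, which is the behavior formalized for \(\mathrm {z}^{LP}_2(\cdot )\) in Propositions 6 and 7.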

Proposition 6

Let \(\mathrm {z}^{NLP}_2= \ell ^*\), then \(\mathrm {z}^{LP}_2(\ell ^*) = \ell ^*\).

Proof

First, \(y = \ell ^*\) satisfies all the constraints in (24), so \(\mathrm {z}^{LP}_2(\ell ^*) \le \ell ^*\). Suppose \(\mathrm {z}^{LP}_2(\ell ^*) < \ell ^*\). Then there exist \(y^* < \ell ^*\) and \(\{(u^i,w^i,z_i)\}_{i \in \mathcal {N}}\) feasible to (20b)–(20e) and to (24b) and (24c) with \(\ell = \ell ^*\), i.e.,

$$\begin{aligned} y^*&\ge c^\top u^i+\ell ^*(1-z_i), \ \forall i \in \mathcal {N},\\ y^*&\ge c^\top w^i+\ell ^*z_i, \ \forall i \in \mathcal {N}. \end{aligned}$$

Thus \(y^* \ge c^\top u^i + y^*(1-z_i)\) and \(y^* \ge c^\top w^i+y^*z_i\), since \(z_i\in [0,1]\) for each \(i\in \mathcal {N}\) and \(y^*<\ell ^*\). So this solution is feasible to (23), which is a contradiction. \(\square \)

Proposition 7

Suppose \(\ell _0 \le \mathrm {z}^{NLP}_2\), and let \(\ell _k = \mathrm {z}^{LP}_2(\ell _{k-1})\) for \(k \ge 1\). Then \(\{ \ell _k, k \ge 1\}\) converges to \(\mathrm {z}^{NLP}_2\).

Proof

As \(\{\ell _k\}\) is monotone and bounded above by \(\mathrm {z}^{NLP}_2\), the sequence converges to some \(\bar{\ell } \le \mathrm {z}^{NLP}_2\) by the monotone convergence theorem. Passing to the limit in \(\ell _k = \mathrm {z}^{LP}_2(\ell _{k-1})\) and using the continuity of \(\mathrm {z}^{LP}_2(\cdot )\) yields \(\mathrm {z}^{LP}_2(\bar{\ell })= \bar{\ell }\), so there exists \(\{(u^i,w^i,z_i)\}_{i \in \mathcal {N}}\) such that this solution with \(y=\bar{\ell }\) is feasible to (24b) and (24c) with \(\ell = \bar{\ell }\), and (20b)–(20e). But this implies that this solution is also feasible to (23), and hence \(\mathrm {z}^{NLP}_2\le \bar{\ell }\). \(\square \)

Similar to the primal formulation (21), enforcing integrality constraints on z and any integer constrained x in (24) yields an alternative “big-M” free formulation for a general chance-constrained MILP.

Proposition 8

Consider a CCSP (1) with assumptions (i)–(iv), then

$$\begin{aligned} \mathrm {v}^*=\min \ \{ y : x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r,z\in Z, \mathrm{(24b)}, \mathrm{(24c)}, \,\mathrm{and}\,\mathrm{(20b)}{-}\mathrm{(20d)}\}, \end{aligned}$$
(25)

is a valid MILP formulation of (3).

Recall that the constraints (24b) and (24c) depend on a given lower bound \(\ell \). In our arguments above we required \(\ell \le \mathrm {z}^{NLP}_2\) in order to argue that the iterative solution of the linear programming relaxation will converge to \(\mathrm {z}^{NLP}_2\). However, any \(\ell \le \mathrm {v}^*\) can be used for validity of the formulation (25). As examples, one may choose to use the quantile bound \(\mathrm {v}^Q\), or \(\mathrm {z}^{NLP}_2\) obtained by iteratively solving (24). In Sect. 6, we develop branch-and-cut decomposition algorithms based on MILP formulations (21) and (25).

5.2.2 A second-order cone programming based approach for \(\mathrm {z}^{NLP}_2\)

Inspired by the nonlinear program (23), we consider the following second order cone programming (SOCP) problem

$$\begin{aligned} \mathrm {z}^{SOC}_2(\ell ) = \min _{\mathbf {x},\beta ,\mathbf {z},\mathbf {u},\mathbf {w},\mathbf {s},\mathbf {t}} \ &\beta \qquad&\mathrm{(26a)}\\ \text {s.t. } \ &\sum _{i \in \mathcal {N}} p_iH_ix^i = h,&\\ &s_i \ge c^\top u^i - \ell z_i, \ \forall \ i \in \mathcal {N},&\mathrm{(26b)}\\ &t_i \ge c^\top w^i - \ell (1-z_i), \ \forall \ i \in \mathcal {N},&\mathrm{(26c)}\\ &s_i^2 \le \beta z_i, \ \forall \ i \in \mathcal {N},&\mathrm{(26d)}\\ &t_i^2 \le \beta (1-z_i), \ \forall \ i \in \mathcal {N},&\mathrm{(26e)}\\ &s_i, t_i \ge 0, \ \forall \ i \in \mathcal {N},&\mathrm{(26f)}\\ &\mathrm{(20b)}{-}\mathrm{(20e)}, \end{aligned}$$
(26)

where \(\ell \) is a lower bound on \(\mathrm {z}^{NLP}_2\). To see that (26) is indeed an SOCP problem, note that \(\beta z_i = \frac{1}{4}(\beta + z_i)^2 -\frac{1}{4}(\beta - z_i)^2\) and \(\beta (1-z_i) =\frac{1}{4}(\beta + (1-z_i))^2-\frac{1}{4}(\beta - (1-z_i))^2\). Note that, since \(\beta \) is nonnegative by (26d), so is \(\mathrm {z}^{SOC}_2(\ell )\). We next relate the values \(\mathrm {z}^{SOC}_2(\ell )\), \(\mathrm {z}^{NLP}_2\), and \(\mathrm {z}^{LP}_2(\ell )\).
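The two difference-of-squares identities invoked above are elementary; a quick numeric check (illustrative only, with a helper name of our own):

```python
def prod_as_squares(a, b):
    # a*b = (1/4)(a + b)^2 - (1/4)(a - b)^2, the rewriting that turns the
    # bilinear terms beta*z_i and beta*(1-z_i) into SOCP-representable form.
    return 0.25 * (a + b) ** 2 - 0.25 * (a - b) ** 2

for beta in (0.0, 0.3, 2.5):
    for z in (0.0, 0.4, 1.0):
        assert abs(prod_as_squares(beta, z) - beta * z) < 1e-12
        assert abs(prod_as_squares(beta, 1.0 - z) - beta * (1.0 - z)) < 1e-12
print("identities hold on the sample grid")
```

With these rewritings, each constraint of the form \(s_i^2 \le \beta z_i\) becomes a rotated second-order cone constraint in \((s_i,\beta ,z_i)\).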

Proposition 9

Let \(\mathrm {z}^{NLP}_2\) be the optimal value of (23), and for each \(\ell \le \mathrm {z}^{NLP}_2\), let \(\mathrm {z}^{LP}_2(\ell ),\mathrm {z}^{SOC}_2(\ell )\) be the optimal values of (24) and (26), respectively. Then these values satisfy

$$\begin{aligned} \mathrm {z}^{NLP}_2\ge \sqrt{ \mathrm {z}^{SOC}_2(\ell )} + \ell \ge \mathrm {z}^{LP}_2(\ell ). \end{aligned}$$

Proof

We first show that \(\mathrm {z}^{NLP}_2\ge \sqrt{ \mathrm {z}^{SOC}_2(\ell )} + \ell \). Let \((\mathbf {x},\mathbf {y},\mathbf {z}, \mathbf {u}, \mathbf {w})\) be an optimal solution of (23). Consider \(\beta '=(\mathbf {y}-\ell )^2\), \(s_i=\max \{c^\top u^i - \ell z_i,0\}\) and \(t_i=\max \{c^\top w^i - \ell (1-z_i),0\}\). It is clear that \((\mathbf {x},\beta ',\mathbf {z}, \mathbf {u}, \mathbf {w},\mathbf {s},\mathbf {t})\) satisfies (20b)–(20e) and (26b), (26c), (26f). From (23b), (23c), and the fact that \(\ell \) is a lower bound on \(\mathrm {z}^{NLP}_2\), we have

$$\begin{aligned}&0\le \mathbf {y}-\ell , \\&\quad c^\top u^i - \ell z_i\le (\mathbf {y}-\ell )z_i, \forall i \in \mathcal {N},\\&\quad c^\top w^i - \ell (1-z_i)\le (\mathbf {y}-\ell )(1-z_i), \forall i \in \mathcal {N}. \end{aligned}$$

Hence, since \(z_i^2\le z_i\) for all \(i \in \mathcal {N}\),

$$\begin{aligned} (s_i)^2&\le (\mathbf {y}-\ell )^2z_i^2\le \beta 'z_i, \forall i \in \mathcal {N},\\ (t_i)^2&\le (\mathbf {y}-\ell )^2(1-z_i)^2\le \beta '(1-z_i), \forall i \in \mathcal {N}. \end{aligned}$$

Thus, \((\mathbf {x},\beta ',\mathbf {z}, \mathbf {u}, \mathbf {w},\mathbf {s},\mathbf {t})\) also satisfies (26d) and (26e). Hence \(\mathrm {z}^{NLP}_2\ge \sqrt{ \mathrm {z}^{SOC}_2(\ell )} + \ell \).

Now we show that \(\sqrt{ \mathrm {z}^{SOC}_2(\ell )} + \ell \ge \mathrm {z}^{LP}_2(\ell )\). Let \((\mathbf {x},\beta ,\mathbf {z}, \mathbf {u}, \mathbf {w},\mathbf {s},\mathbf {t})\) be an optimal solution of (26). Consider \(\mathbf {y}'=\sqrt{\beta }+\ell \). It is clear that \((\mathbf {x},\mathbf {y}',\mathbf {z}, \mathbf {u}, \mathbf {w})\) satisfies (20b)–(20e). From (26b)–(26f), the fact that \(\ell \) is a lower bound on \(\mathrm {z}^{NLP}_2\), and \(z_i\le 1\) for all \(i\in \mathcal {N}\), we have

$$\begin{aligned}&(\max \{c^\top u^i - \ell z_i,0\})^2\le (\mathbf {y}'-\ell )^2z_i\le (\mathbf {y}'-\ell )^2, \forall i \in \mathcal {N},\\&\quad (\max \{c^\top w^i - \ell (1-z_i),0\})^2\le (\mathbf {y}'-\ell )^2(1-z_i)\le (\mathbf {y}'-\ell )^2, \forall i \in \mathcal {N}. \end{aligned}$$

Taking square roots of the above inequalities we see that \((\mathbf {x},\mathbf {y}',\mathbf {z}, \mathbf {u}, \mathbf {w})\) satisfies (24b) and (24c). Hence \(\sqrt{ \mathrm {z}^{SOC}_2(\ell )} + \ell \ge \mathrm {z}^{LP}_2(\ell )\). \(\square \)

Based on the above result, we can extend the successive linear programming approach established in Propositions 6 and 7 to one that solves a sequence of SOCPs of the form (26). Similarly to (25), enforcing integrality constraints on z and any integer constrained x variables in the SOCP (26) leads to a “big-M” free mixed-integer SOCP (MISOCP) formulation for a general chance-constrained MILP.

Proposition 10

Consider a CCSP (1) with assumptions (i)–(iv), then

$$\begin{aligned} (\mathrm {v}^*- \ell )^2 = \min \ \{ \beta : x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r,z\in Z, \mathrm{(26b)}{-}\mathrm{(26f)},\, \mathrm{and}\,\mathrm{(20b)}{-}\mathrm{(20d)}\}, \end{aligned}$$
(27)

is a valid MISOCP formulation of (3).

6 Decomposition algorithms

In this section, we introduce a heuristic algorithm inspired by the bisection procedure for calculating the Lagrangian dual \(\mathrm {v}^{LD}_2\) and also present two exact algorithms for solving CCSPs (3).

6.1 A heuristic scheme

The idea of our proposed heuristic algorithm is to use bisection to search for a value \(\ell \) so that fixing \(y = \ell \) in (13) may yield a feasible solution. Let \(X^i=\{x\in {\mathbb {R}}^{n-r}\times {\mathbb {Z}}^r: G_i(x)\le 0\}\) for all \(i \in \mathcal {N}\) and let L and U be known initial lower and upper bounds on the optimal value of (13). For a fixed \(y \in [L,U]\), say \(y = (L+U)/2\), we consider the following optimization problem that minimizes the sum of infeasibilities for each scenario:

$$\begin{aligned} \min _{x,s} \left\{ \sum _{i \in \mathcal {N}} p_i s_i : \ c^\top x \le y, \ G_i(x) \le s_i \mathbf {e}, \ s_i \ge 0, \ \forall \ i \in \mathcal {N}, \ x \in {\mathbb {R}}^{n-r} \times \mathbb {Z}^r \right\} , \end{aligned}$$
(28)

where \(\mathbf {e}\) is a vector of all ones. This problem has the form of a two-stage stochastic program with simple recourse and can benefit from specialized decomposition schemes for such problems. Given an optimal solution \((\widehat{x},\widehat{s})\) of (28), we check whether it is feasible to the original problem (13). We set \(\widehat{z}_i =\mathbb {I}(\widehat{s}_i=0)\) for all \(i \in \mathcal {N}\). If \(\sum _{i\in \mathcal {N}} p_i \widehat{z}_i \ge 1- \epsilon \), then \(\widehat{x}\) is feasible to (13), and therefore y is a valid upper bound for (13). Then we can set \(U=y\) and repeat the above steps to find a better feasible solution and hence a better upper bound. On the other hand, if \(\sum _{i\in \mathcal {N}} p_i \widehat{z}_i < 1- \epsilon \), we set \(L = y\) and repeat the above steps to try to find a feasible solution. Notice that L might not be a valid lower bound on \(\mathrm {v}^*\) when the algorithm terminates. The detailed procedure is described in Algorithm 1.
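The steps above can be sketched compactly; `solve_min_infeasibility` below is an abstract stand-in for solving (28) at a given y, and all names are illustrative rather than taken from the paper:

```python
def bisection_heuristic(solve_min_infeasibility, probs, eps, L, U, delta=1e-4):
    """Bisect on y; accept y as an upper bound whenever the minimum-
    infeasibility solution covers probability mass at least 1 - eps."""
    best_x = None
    while U - L > delta:
        y = 0.5 * (L + U)
        x, s = solve_min_infeasibility(y)
        covered = sum(p for p, s_i in zip(probs, s) if s_i == 0)
        if covered >= 1 - eps:
            U, best_x = y, x   # y is a valid upper bound on v*
        else:
            L = y              # note: L need not be a valid lower bound on v*
    return U, best_x

# Toy oracle: all scenarios become satisfiable once the budget y reaches 0.7.
oracle = lambda y: (y, [0, 0, 0] if y >= 0.7 else [1, 1, 1])
U, x = bisection_heuristic(oracle, [1/3, 1/3, 1/3], 1/3, 0.0, 1.0)
print(round(U, 3))
```

The loop performs one solve of (28) per iteration, so the total work is logarithmic in \((U-L)/\delta \) solves.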

Algorithm 1 Bisection-based heuristic

Let \(\mathrm {v}^{H}\) denote the objective value of the solution returned by Algorithm 1. We next show that \(0\le \mathrm {v}^{H}-\mathrm {v}^*\le \delta \) under certain conditions.

Proposition 11

Suppose \(G_i(x): S \rightarrow {\mathbb {R}}_{-}^{m_i}\cup \mathcal {R}_M^i,\forall i\in \mathcal {N}\), where \(\mathcal {R}_M^i=\{s\in {\mathbb {R}}^{m_i}:\Vert s\Vert _{\infty }=M\}\) for all \(i\in \mathcal {N}\), \(M\in {\mathbb {R}}_+\). Then Algorithm 1 returns a feasible solution with \(\mathrm {v}^{H}-\mathrm {v}^*\le \delta \), where \(\delta >0\) is the chosen stopping tolerance parameter.

Proof

First of all, we observe that for any optimal solution \((\mathbf {x}^*,\mathbf {s}^*)\) of (28) with a given y, \(s_i^*\) is either 0 or M for each \(i\in \mathcal {N}\). Indeed, if \(G_i(\mathbf {x}^*)\le 0\), then \(s_i^*=0\); otherwise, the smallest \(s_i^*\) that can be chosen is M since \(G_i(\mathbf {x}^*)\in \mathcal {R}_M^i\).

Suppose that \((\widehat{\mathbf {x}},\widehat{\mathbf {z}})\) is an optimal solution of (3), and let \(\widehat{\mathbf {s}}=M\mathbf {e}-M\widehat{\mathbf {z}}\). Then \((\widehat{\mathbf {x}},\widehat{\mathbf {s}})\) is feasible to (28) for any \(y\ge \mathrm {v}^*\), with objective value \(\sum _{i\in \mathcal {N}}p_i\widehat{s}_i\le M\epsilon \). Hence, by the first observation, any optimal solution of (28) with \(y\ge \mathrm {v}^*\) satisfies \(\sum _{i\in \mathcal {N}}p_i\mathbb {I}(s^*_i=0)\ge 1-\epsilon \), so every such \(y\) is accepted as an upper bound and the bisection terminates with \(\mathrm {v}^{H}\le \mathrm {v}^*+\delta \). \(\square \)

The conditions of Proposition 11 are identical to those in Proposition 4, and the chance-constrained set covering problems (18) satisfy these conditions. Note also that for this problem class with an integer cost vector we can choose \(\delta < 1\) and recover an exact optimal solution.

6.2 A scenario decomposition algorithm for chance-constrained 0–1 programs

For two-stage stochastic programs in which the first-stage variables are all binary, [1] presented a scenario decomposition algorithm that uses the nonanticipative Lagrangian dual of such problems. In this approach, feasible solutions from the scenario subproblems are used to update the upper bound. We describe a simple extension of this method to solve chance-constrained 0–1 programs, which can take advantage of the new Lagrangian dual problems proposed in Sect. 3. Exactly solving the Lagrangian dual problems (8) and (17) may be computationally challenging. However, the scenario decomposition algorithm remains valid even if the Lagrangian dual multipliers are not optimal. In a practical implementation, we may settle for a lower bound on \(\mathrm {v}^{LD}_1\) or \(\mathrm {v}^{LD}_2\). For example, we may simply use the quantile bound \(\mathrm {v}^Q\), or even a valid lower bound from the scenario grouping based relaxation (5).

Algorithm 2 Scenario decomposition

Algorithm 2 provides a description of the scenario decomposition approach. Finite convergence of Algorithm 2 is an immediate consequence of the following three facts: the lower bound is nondecreasing; a feasible solution is never excluded if it has not been evaluated; and there are finitely many feasible solutions, since for each scenario \(i\in \mathcal {N}\), \(X^i\) is a finite set. The update of the set \(X^i\) in line 11 can be implemented with a “no-good” cut, based on the assumption that all the x variables are binary; see [1] for details.

6.3 Branch-and-cut algorithms based on primal MILP formulations

Section 5 provided two MILP formulations (21) and (25) for (3) when the objective function and all constraints are linear. These formulations have a set of variables and constraints for each scenario \(i \in \mathcal {N}\), so solving them directly may be time-consuming. We therefore apply a branch-and-cut framework to solve these two MILPs. Specifically, a Benders decomposition algorithm is applied in which the master problem consists of the decision variables x and the scenario decision variables z, and in the case of (25) the variable y. The decision variables \(u^i\) and \(w^i\) for \(i \in \mathcal {N}\) are projected out, and Benders feasibility cuts are added to iteratively build the projected feasible region. The generation of Benders feasibility cuts decomposes into a separate subproblem for each scenario. The detailed development of the cuts and the resulting branch-and-cut algorithms are presented in the online supplement.

7 Numerical illustration

In this section we evaluate the proposed bounds and algorithms on two sets of multi-dimensional knapsack instances, mk-20-10 and mk-39-5, from [35] (these instances are named 1-4-multi and 1-6-multi, respectively, in [35]). The instance name mk-n-m indicates that the instance has n decision variables and m constraints in each scenario. For both sets of instances, we consider four different scenario sizes: \(N \in \{100,500,1000,3000\}\). For each scenario size (e.g., \(N=100\)), we perform five different replications. Since the results among different replications are similar, we report averages over the five replications. We consider two different types of variables x: continuous and binary. The deterministic feasible set is \(S = [0,1]^n\) for the instances with continuous x variables and \(S = \{0,1\}^n\) for those with binary x variables. For each scenario \(i \in \mathcal {N}\), the feasible set is \(X^i := \{x\in S : A^i x \le b^i\}\), where \(A^i\in \mathbb {R}^{m\times n}_+,b^i\in \mathbb {R}^m_+\) is the ith realization of the random data \(\xi \), and the objective is to minimize \(-c^\top x\), where the coefficient vector \(c\in \mathbb {R}^n_+\). In all our experiments, we set the number of threads to one, and we use the default optimality gap tolerance in CPLEX for solving MIP models, which is \(0.01\%\).

7.1 Illustration of bound quality on instances with continuous x

We first present a numerical illustration on the strength of the Lagrangian dual bounds on instances with continuous x, i.e., when \(S = [0,1]^n\). We compare the proposed Lagrangian bounds \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^{LD}_2\) with the LP relaxation bound \(\mathrm {v}^C(M)\) (\(\underline{\mathrm {v}}^C(M)\) and \(\mathrm {v}^C(M)\) are identical in this case), quantile bound \(\mathrm {v}^Q\), and the optimal objective value \(\mathrm {v}^*\). Since x is continuous, from Theorems 6 and 7, the Lagrangian dual bounds \(\mathrm {v}^{LD}_1, \mathrm {v}^{LD}_2\) are equal to \(\mathrm {z}^{LP}_1\), \(\mathrm {z}^{NLP}_2\), respectively. Therefore, we compute \(\mathrm {v}^{LD}_1\) using (20), and when computing \(\mathrm {v}^{LD}_2\) (\(\mathrm {z}^{NLP}_2\)), we start with the quantile bound \(\mathrm {v}^Q\), then solve the primal LP (24) iteratively using Benders decomposition.

For \(\mathrm {v}^{LD}_1\), \(\mathrm {v}^{LD}_2\) and \(\mathrm {v}^C(M)\), we report bounds that are obtained with and without big-M strengthening, a coefficient strengthening procedure introduced in [35] for CCSPs. Without any big-M strengthening, a naive big-M parameter for formulation (4) is calculated by: \(M_{ij} = \max \{A^i_jx-b^i_j: x\in S\}\), for all \( i\in \mathcal {N}, \ j=1,2,\ldots , m_i\). At the other extreme, one could obtain a strengthened big-M parameter \(M_{kl}\) for scenario \(k\in \mathcal {N}\), constraint l by solving an MIP that has the same structure as formulation (4) for the original CCSP: \(M_{kl} = \max \left\{ A^k_l x-b^k_l: A^ix \le b^i+M_i(1-z_i), \ \forall i\in \mathcal {N}, x\in S, z\in Z\right\} \), which would be too time-consuming. The idea of the big-M strengthening is to obtain a relaxation bound of this MIP using the quantile bound presented in Sect. 2.2. The strengthened big-M parameters \(M_i\)’s are useful for formulations other than (4) as well, where the scenario-independent feasible set S is strengthened by valid inequalities \(A^i x \le M_i+b^i\) for all \(i\in \mathcal {N}\).
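For the box set \(S=[0,1]^n\) used here, the naive big-M values admit a closed form, since the maximizer of \(A^i_j x - b^i_j\) over the box sets \(x_k=1\) exactly where the coefficient is positive. A small illustrative sketch (the helper name is ours):

```python
def naive_big_m(A_i, b_i):
    """M_ij = max{A^i_j x - b^i_j : x in [0,1]^n}, computed row by row:
    pick x_k = 1 where the coefficient is positive, x_k = 0 elsewhere."""
    return [sum(max(a, 0.0) for a in row) - b_ij
            for row, b_ij in zip(A_i, b_i)]

A = [[2.0, 3.0, 0.0],
     [1.0, 0.0, 4.0]]
b = [1.0, 2.0]
print(naive_big_m(A, b))  # [4.0, 3.0]
```

For the nonnegative \(A^i\) of these knapsack instances this reduces to the row sum of \(A^i_j\) minus \(b^i_j\), which is exactly why the naive values can be very loose and why the strengthening procedure pays off.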

In Table 1, we show the optimality gaps of \(\mathrm {v}^C(M)\), \(\mathrm {v}^Q\), \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^{LD}_2\) in the columns labeled accordingly, where the optimality gap for a given lower bound LB is defined as \((\mathrm {v}^*- LB)/|\mathrm {v}^*|\). Under each lower bound, the columns under the label ‘With big-M Str.’ provide the bounds obtained if the big-M coefficients have been strengthened, and the columns under the label ‘No big-M Str.’ provide the bounds obtained without strengthening the big-M coefficients. We also show the optimality gap of the upper bound UB given by the heuristic algorithm described in Sect. 6.1 in the column labeled \(\mathrm {v}^{H}\), which is calculated as \((UB - \mathrm {v}^*)/|\mathrm {v}^*|\). In Table 2, we present the average computation time, in seconds, for obtaining each of these bounds. In addition, the column ‘M-T’ displays the time spent on the big-M strengthening procedure, which is a pre-processing step required for calculating the bounds under the ‘With big-M Str.’ columns. (Thus, the total time for calculating such a bound is the sum of the ‘M-T’ and the bound time.)

Table 1 Bound comparison for multi-dimensional continuous knapsack instances

We can see from Table 1 that strengthening big-M parameters significantly improves the bounds given by \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^C(M)\). Without strengthening, these bounds are rather weak, especially at the higher risk tolerance \(\epsilon = 0.2\). With strengthened big-M parameters, the difference between \(\mathrm {v}^{LD}_1\) and \(\mathrm {v}^C(M)\) is very small. On the other hand, the improvement from strengthening is modest for \(\mathrm {v}^{LD}_2\), since \(\mathrm {v}^{LD}_2\) already gives a tight bound without it. Overall, the best bounds are obtained by combining strengthened big-M parameters with \(\mathrm {v}^{LD}_2\). We also find that the heuristic scheme yields a very small optimality gap. For large instances where an exact optimal solution may be hard to find, one may accept the heuristic solution \(\mathrm {v}^{H}\) when the gap between the lower bound given by \(\mathrm {v}^{LD}_2\) and the upper bound given by \(\mathrm {v}^{H}\) is small enough. From Table 2 we see that these strong bounds can be obtained in a small amount of time. Interestingly, formulations with naive big-M parameters take longer to solve than those with strengthened parameters, even after accounting for the time spent on strengthening. Thus, for these instances, big-M strengthening improves both computation time and bound quality.

Table 2 Computational times for computing bounds for multi-dimensional continuous knapsack instances

7.2 Illustration of the branch-and-cut algorithm on instances with continuous x

In this section we describe computational experiments using the branch-and-cut approach described in Sect. 6.3 for solving formulation (25) on instances with continuous x. For all experiments in our test, we use a time limit of 3600 seconds. We turn off CPLEX presolve for the branch-and-cut algorithm in order to be able to add user cuts and lazy constraints. We use the heuristic solution obtained by the heuristic algorithm in Sect. 6.1 as an MIP start solution. Further implementation details of the branch-and-cut algorithm are given in the online supplement.

In Table 3 we compare the performance of three computational options: the MIP (4) using strengthened big-M parameters (MIP-(4)), the branch-and-cut algorithm with strengthened big-M parameters (Benders With big-M), and the branch-and-cut algorithm without strengthened big-M parameters (Benders without big-M). For instances that are not solved to optimality within the time limit, we show in parentheses the number of instances out of five replications that are solved to optimality, and report the average optimality gap. For these instances, we use the number of nodes processed up to the time limit to calculate the average number of nodes.

Table 3 Computational results for MIP formulation (4) and branch-and-cut algorithm on multi-dimensional continuous knapsack instances

We observe from Table 3 that the performance of the branch-and-cut algorithm is improved by using strengthened big-M parameters, consistent with what is shown in Table 1. Also, Table 1 shows that the root relaxation bound for the branch-and-cut algorithm is tighter than that of the MIP formulation (4). However, this advantage at the root node does not translate into a significant improvement in total computation time for solving these instances to optimality. For the instances that could be solved within the time limit, the computation times of the big-M formulation and the branch-and-cut algorithm are similar, although for the instances with the largest number of scenarios in our experiments, the branch-and-cut algorithm yields much smaller optimality gaps. We also notice that both methods seem to have more difficulty solving the instances with a higher risk tolerance \(\epsilon \). It appears that branching in the Benders formulation is less effective than in the MIP formulation (4), so more nodes are explored. This motivates further study of effective ways to exploit the strong relaxation bound \(\mathrm {v}^{LD}_2\) when solving CCSPs to optimality.

7.3 Performance on instances with binary x

We next consider the binary instances, i.e., with \(S = \{0,1\}^n\). We compare the proposed dual bounds and illustrate the effectiveness of the heuristic algorithm (Algorithm 1), the scenario decomposition algorithm (Algorithm 2) with and without scenario grouping, and the MIP formulation (4) with strengthened big-M coefficients. For the scenario decomposition algorithm, the lower bounds are obtained by \(\mathrm {v}^Q\). For the scenario grouping, the number of groups K is chosen as the smallest divisor of N that is larger than \(\lceil \epsilon N\rceil \), and the scenarios are divided into K groups of equal size.
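The grouping rule just described can be sketched as follows (a hedged illustration: the function name and the consecutive assignment of scenarios to groups are our choices; the text specifies only K and the equal group sizes).

```python
import math

def group_scenarios(N, eps):
    """Choose K as the smallest divisor of N strictly larger than
    ceil(eps * N), then split scenarios 1..N into K equal-size groups."""
    threshold = math.ceil(eps * N)
    # K = N always works, so the generator below is guaranteed to find a divisor.
    K = next(d for d in range(threshold + 1, N + 1) if N % d == 0)
    size = N // K
    scenarios = list(range(1, N + 1))
    return [scenarios[g * size:(g + 1) * size] for g in range(K)]

groups = group_scenarios(N=20, eps=0.2)
# ceil(0.2 * 20) = 4; the smallest divisor of 20 larger than 4 is 5,
# so we get 5 groups of 4 scenarios each.
```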

Table 4 summarizes the optimality gaps of \(\underline{\mathrm {v}}^C(M)\), \(\mathrm {z}^{LP}_1\), \(\mathrm {z}^{NLP}_2\), \(\mathrm {v}^Q\), \(\mathrm {v}^{QG}\) and \(\mathrm {v}^{H}\), where \(\mathrm {v}^{QG}\) denotes the grouping quantile bound. For these instances, we report only gaps obtained with strengthened big-M coefficients. As we do not have the optimal solutions for most of the mk-39-5 instances, we use the best known upper and lower bounds to estimate the lower bound and heuristic gaps, respectively. Table 5 displays the time to compute these bounds. (Note that the times for computing \(\underline{\mathrm {v}}^C(M)\), \(\mathrm {z}^{LP}_1\), and \(\mathrm {z}^{NLP}_2\) are the times for \(\mathrm {v}^C(M)\), \(\mathrm {v}^{LD}_1\), and \(\mathrm {v}^{LD}_2\) from Table 2 for the continuous x case, since these are equivalent.)

Table 4 Bound comparison for multi-dimensional binary knapsack instances
Table 5 Computational times for computing bounds for multi-dimensional binary knapsack instances

In Table 4, we see that the best Lagrangian dual bounds corresponding to continuous x still have at most a 4% optimality gap, which demonstrates the effectiveness of these bounds. In addition, the quantile bound, which is obtained by solving binary IP subproblems, is somewhat stronger than any of the bounds \(\underline{\mathrm {v}}^C(M)\), \(\mathrm {z}^{LP}_1\), and \(\mathrm {z}^{NLP}_2\). On the other hand, we see from Table 5 that the quantile bound \(\mathrm {v}^Q\) takes longer to calculate. However, when we apply the quantile bound to the scenario grouping relaxation, the resulting bounds \(\mathrm {v}^{QG}\) are comparable with the quantile bound obtained without grouping, but take much less time to compute (see Table 5).

Table 6 Performance of scenario decomposition and MIP on the multi-dimensional binary knapsack instances

We observe in Tables 4 and 5 that for the mk-20-10 instances the heuristic performs extremely well in terms of both quality (within \(0.01\%\) optimality gap) and solution time. For the instances with \(N=3000\), the optimality gaps are not exact since we are not able to obtain the optimal objective values due to memory limitations. However, we can see that the bounds obtained from the heuristic methods are still quite close to the optimal ones. Thus, the solution from the heuristic method could be treated as a good starting point for other algorithms.

Table 6 presents the results of solving these instances to optimality using the scenario decomposition algorithm (Algorithm 2) with and without scenario grouping, and the MIP formulation (4) with strengthened big-M coefficients. We find that the scenario grouping based decomposition method significantly outperforms the one without grouping, both in computation time for the instances solved within the time limit and in final optimality gaps for the remaining instances. From Table 6, we further observe that when the number of scenarios is small (e.g., not larger than 1000), the MIP formulation (4) with strengthened big-M parameters gives the best performance among the three methods. However, when the number of scenarios is larger, the MIP formulation (4) could not close the optimality gap within the time limit, while the scenario decomposition method with grouping can still solve all of the mk-20-10 instances within 15 minutes. We also observe that neither method is able to solve the majority of the mk-39-5 instances. This negative result for the scenario decomposition approach on problems with a larger x space is somewhat expected, since only simple “no-good” cuts are employed to drive convergence. This motivates future work on better ways to incorporate the strong Lagrangian dual bound into the scenario decomposition approach.

8 Conclusion

We studied two new Lagrangian dual problems for chance-constrained stochastic programs and developed their associated primal formulations, which can be used to compute these dual bounds exactly for chance-constrained linear programs, or a lower bound on them for chance-constrained mixed-integer programs. We also proposed a new heuristic method and two new exact algorithms that use these bounds to solve such problems to optimality. In our numerical study, we find that for all of our instances the dual bounds can be computed quickly, and we demonstrate that the heuristic solutions are within 4% of optimal. Our exact algorithms are able to solve more than half of the instances to optimality, although some challenging instances remain unsolved. A direction for future work is therefore to investigate alternative ways of using these strong bounds to solve these problems to optimality, as well as better ways of excluding solutions and of optimally grouping scenarios in the scenario decomposition approach. In addition, for chance-constrained mixed-integer programs, rather than using the (weaker) primal formulations as we have done in this work, it would be interesting to explore efficient techniques for directly computing the proposed dual bounds using, e.g., bundle methods [12, 13, 19].