
A Trust Region Method for the Solution of the Surrogate Dual in Integer Programming

Journal of Optimization Theory and Applications

Abstract

We propose an algorithm for solving the surrogate dual of a mixed integer program. The algorithm uses a trust region method based on a piecewise affine model of the dual surrogate value function. A new, much more flexible way of updating bounds on the surrogate dual’s value is proposed, which numerical experiments show to be advantageous. A proof of convergence is given, and numerical tests show that the method outperforms a state-of-the-art subgradient solver. Incorporating the surrogate dual value as a cut added to the integer program is shown to greatly reduce the solution times of a standard commercial solver on a specific class of problems.


References

  1. Greenberg, H.J., Pierskalla, W.P.: Surrogate mathematical programming. Oper. Res. 18, 924–939 (1970)

  2. Karwan, M., Rardin, R.: Surrogate dual multiplier search procedures in integer programming. Oper. Res. 32, 52–69 (1984)

  3. Karwan, M., Rardin, R.: Some relationships between Lagrangian and surrogate duality in integer programming. Math. Progr. 17, 320–334 (1979)

  4. Kim, S.-L., Kim, S.: Exact algorithm for the surrogate dual of an integer programming problem: subgradient method approach. J. Optim. Theory Appl. 96, 363–375 (1998)

  5. Sarin, S., Karwan, M., Rardin, R.: Surrogate duality in a branch-and-bound procedure for integer programming. Eur. J. Oper. Res. 33, 326–333 (1988)

  6. Li, D., Sun, X.: Nonlinear Integer Programming. International Series in Operations Research & Management Science. Springer (2006)

  7. Noll, D.: Bundle method for non-convex minimization with inexact subgradients and function values. Comput. Anal. Math. 50, 555–592 (2013)

  8. Linderoth, J., Wright, S.: Decomposition algorithms for stochastic programming on a computational grid. Comput. Optim. Appl. 24, 207–250 (2003)

  9. Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 306. Springer-Verlag, Berlin (1993)

  10. Boland, N., Eberhard, A., Tsoukalas, A.: A trust region method for the solution of the surrogate dual in integer programming. Optimization Online (2014). http://www.optimization-online.org/DB_HTML/2014/02/4249.html

  11. Frangioni, A.: Generalized bundle methods. SIAM J. Optim. 13, 117–156 (2002)

  12. Helmberg, C.: ConicBundle library. https://www-user.tu-chemnitz.de/~helmberg/ConicBundle/

  13. Han, B., Leblet, J., Simon, G.: Hard multidimensional multiple choice knapsack problems: an empirical study. Comput. Oper. Res. 37, 172–181 (2010)

  14. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. A series of comprehensive studies in mathematics. Springer, Berlin (1998)


Acknowledgments

We thank two anonymous referees, whose constructive comments improved the paper. This research was supported by the ARC Discovery Grant No. DP0987445.

Author information


Correspondence to A. C. Eberhard.

Appendix

In the following, we denote the support function of a closed convex set \(A\) by \(\delta ^{*}\left( A\right) \left( u\right) :=\sup \left\{ au : a\in A\right\} .\) We use epi-limits and Attouch’s theorem, for which the reader may consult [14] for details.
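When \(A\) is a polytope, say the convex hull of finitely many points (as for the subdifferentials of the piecewise affine functions appearing below), the support function reduces to a maximum of finitely many linear forms. The following minimal Python sketch illustrates this; the generators and the test vector are invented for illustration and are not data from the paper.

import numpy as np

# Support function of A = co{a_1, ..., a_m}:
# delta*(A)(u) = max_j <a_j, u>, so a finite enumeration suffices.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])      # hypothetical generators

def support(u):
    return float(np.max(A @ u))   # maximum of the inner products a_j . u

print(support(np.array([2.0, 1.0])))   # max(2, 1, -3) = 2.0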

Proof

(of Lemma 4.1) Suppose \(V_{\alpha }(u+\gamma d) < V_{\alpha }(u) <+\infty \) for some sufficiently small \(\gamma >0\), and let \(x\in \arg \max \left\{ u\left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} \). Then \(x\in X\left( \alpha \right) \) and so

$$\begin{aligned} \gamma d\left( Ax-b\right)&\le \max \left\{ \left( u+\gamma d\right) \left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} -u\left( Ax-b\right) \\&= V_{\alpha }(u+\gamma d ) - V_{\alpha }(u) <0, \end{aligned}$$

implying \(d\left( Ax-b\right) <0\) holds for all \(x\in \arg \max \left\{ u\left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} \). Consequently, this is also true for all \(s\) defined as in (5).

Conversely, for all \(x_{\gamma }\in M\left( \gamma \right) :=\arg \max \left\{ \left( u+\gamma d\right) \left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} ,\)

$$\begin{aligned} V_{\alpha } (u+\gamma d)&=\left( u+\gamma d\right) \left( Ax_{\gamma }-b\right) \\&=u\left( Ax_{\gamma }-b\right) +\gamma d\left( Ax_{\gamma }-b\right) \le V_{\alpha }( u) +\gamma d\left( Ax_{\gamma }-b\right) \text {.} \end{aligned}$$

Note that since \(M(0) \) is bounded and \(g_{\gamma }( x) := (u+\gamma d) (Ax-b) +\delta _{X(\alpha ) }(x) \) defines a family of level-set bounded, proper convex functions, epi-converging to \(g(x) := u( Ax-b) +\delta _{X( \alpha ) }(x) \), we may invoke [14], Exercise 7.32 (c) and Theorem 7.33 to deduce \(\limsup _{\gamma \downarrow 0}M(\gamma ) \subseteq M( 0)\). When we assume \(d(Ax-b) <0\) for all \(x\in M(0) \), we have \(d( Ax_{\gamma }-b) <0\) for \(\gamma \) sufficiently small. If we assume otherwise, i.e., that there exist \(x_{\gamma _{m}}\in M(\gamma _{m}) \) for \(\gamma _{m}\downarrow 0\) with \(d(Ax_{\gamma _{m}}-b) \ge 0\), a contradiction follows. As \(V_{\alpha }(\cdot ) \) is a finite, convex function, it is locally Lipschitz and so \(\partial V_{\alpha }(\cdot ) \) is locally uniformly bounded, implying local boundedness of \(s_{\gamma }:= ( Ax_{\gamma }-b)\). This in turn implies local boundedness of \(\{ x_{\gamma _{m}}\} \), and on taking any convergent subsequence \(x_{\gamma _{m_{k}}}\rightarrow x\in M(0) \), we find that the assumption \(d(Ax_{\gamma _{m_{k}} }-b) \ge 0\) implies the contradiction \(d(Ax-b) \ge 0\) for some \(x\in M(0) \). Thus, \(d( Ax_{\gamma }-b) <0\) and \( V_{ \alpha }(u+\gamma d ) \le V_{\alpha } (u) +\gamma d(Ax_{\gamma }-b) < V_{\alpha }(u) \), for \(\gamma \) small. If there exists a sequence \(x_{\gamma _{m}}\in M(\gamma _{m}) \) such that \(x_{\gamma _{m} }\rightarrow x\in M( 0) \) with \(d(Ax-b) <0\), then the same argument implies \(d\) is a descent direction. Indeed, the presumption that there exists a further subsequence such that \(d( Ax_{\gamma _{m_{k}}}-b) \ge 0\) for all \(k\) implies the contradiction \(d(Ax-b) \ge 0\). Thus, \(\gamma _{m}d(Ax_{\gamma _{m} }-b) <0\), for \(m\) large, implying \(V_{ \alpha }( u+\gamma _{m}d ) < V_{\alpha }(u) \).

Finally, when \(x_{\gamma }\in M( \gamma ) \) we have \(s_{\gamma }:= ( Ax_{\gamma }-b) \in \partial V_{\alpha } ( u+\gamma d)\), and as \(V_{\alpha }(\cdot ) \) is a finite, convex function, it is also semi-smooth. Consequently, any convergent subsequence \(s_{\gamma _{m}}:= ( Ax_{\gamma _{m}}-b) \rightarrow s= ( Ax-b) \in \partial _{d} V_{\alpha } (u)\), for some \(x\in M(0) \). Assuming that \(d( Ax-b) <0\) for \(s= (Ax-b) \in \partial _{d} V_{\alpha } (u) \), we may argue as above that, for \(\gamma \) sufficiently small, \(d(Ax_{\gamma }-b) <0\). Hence \(V_{\alpha }(u+\gamma d) < V_{\alpha }(u) \). \(\square \)
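To see Lemma 4.1 at work numerically: for a finite \(X(\alpha )\) the function \(V_{\alpha }(u)=\max \{u(Ax-b) : x\in X(\alpha )\}\) is piecewise affine, and \(d\) is a descent direction at \(u\) precisely when \(d(Ax-b)<0\) for every maximizer \(x\). A minimal Python sketch, in which \(A\), \(b\) and \(X(\alpha )\) are invented for illustration:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 1.0]])                 # hypothetical constraint matrix
b = np.array([1.0, 1.0])
X_alpha = [np.array([1.0, 0.0]),
           np.array([0.0, 1.0]),
           np.array([1.0, 1.0])]           # hypothetical finite X(alpha)

def V_and_maximizers(u, tol=1e-9):
    # V_alpha(u) = max over x in X(alpha) of u.(Ax - b), with its argmax set
    vals = [float(u @ (A @ x - b)) for x in X_alpha]
    V = max(vals)
    return V, [x for x, v in zip(X_alpha, vals) if v >= V - tol]

u, d = np.array([1.0, 1.0]), np.array([-1.0, 0.0])
V, maximizers = V_and_maximizers(u)
# Lemma 4.1: d is a descent direction iff d.(Ax - b) < 0 for all maximizers x.
print(all(float(d @ (A @ x - b)) < 0.0 for x in maximizers))   # True

Here the unique maximizer is \(x=(1,1)\) with \(d(Ax-b)=-2<0\), and indeed \(V_{\alpha }(u+\gamma d)=5-2\gamma <V_{\alpha }(u)=5\) for small \(\gamma >0\).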

Proof

(of Lemma 5.1) We assume we have a sequence of trust regions of diameter \(\Delta _{k}\downarrow 0\), and an associated sequence of \(u_{k+1}\), generated in the evaluation of \(V_{\alpha _{k}} (u_{k+1} ) \) when calculating \(\rho _{k+1}\). In Step 3 of Algorithm SDTR, we add some \(s_{k+1}\in \partial V_{\alpha _{k}} \left( u_{k+1}\right) \) by choosing \(s_{k+1}=Az_{k+1}-b\).

By assumption, \(\left\| u_{k+1}-u_{k}\right\| _{\infty }\downarrow 0\). As \(u_{k}\rightarrow u\), we have \(\left\| u_{k+1}-u\right\| _{\infty }\rightarrow 0\). Also by assumption, there exists a subsequence of \(\left\{ u_{k}\right\} _{k=0}^{\infty }\), denoted by \(\left\{ u_{p}\right\} _{p=0}^{\infty }\), with associated directions \(d_{p}\rightarrow d\) such that \(v^{\prime }(P(\alpha ,\cdot )) ( u,d) <0\). Denote \(t_{p}:=\left\| u_{k_{p}+1}-u_{k_{p}}\right\| _{\infty }\), so that \(u_{k_{p}+1}-u_{k_{p}}=t_{p}d_{p}\). Note that in Step 3(b), just before returning to do Step 3 again, we add \(z_{k_{p}+1}\in \arg \max \left\{ u_{k_{p}+1}\left( Ax-b\right) : x\in X\left( \alpha _{k_{p}}\right) \right\} \) found while calculating \(\rho _{k_{p}+1}\). This implies \(\left( Az_{k_{p}+1}-b\right) \) must satisfy the following:

$$\begin{aligned} V_{\alpha _{k_{p}}} ( u_{k_{p}}+t_{p}d_{p} )&=\left( Az_{k_{p}+1}-b\right) \cdot \left( u_{k_{p}}+t_{p}d_{p}\right) \nonumber \\&\ge s\cdot \left( u_{k_{p}}+t_{p}d_{p}\right) ,\quad \forall s=\left( Ax-b\right) \text {, }x\in X_{k_{p}}\left( \alpha _{k_{p}}\right) . \end{aligned}$$
(10)

Take \(y_{p}\in X_{k_{p}}\left( \alpha _{k_{p}}\right) \) satisfying \(\beta _{k_{p} +1}=u_{k_{p}+1}\left( Ay_{p}-b\right) \) in the problem (3) from which we obtain \(\left( u_{k_{p}+1},\beta _{k_{p}+1}\right) \). Due to (10) and \(y_{p}\in X_{k_{p}}( \alpha _{k_{p}})\), we have

$$\begin{aligned} V_{\alpha _{k_{p}}} ( u_{k_{p}}+t_{p}d_{p} )\ge \left( Ay_{p}-b\right) \cdot \left( u_{k_{p}}+t_{p}d_{p}\right) =\beta _{k_{p}+1}. \end{aligned}$$
(11)

When (7) holds, we have \(0> V_{\alpha _{k_{p}}} ( u_{k_{p}+1} ) - V_{\alpha _{k_{p}}} ( u_{k_{p}} )\ge \beta _{k_{p}+1}- V_{\alpha _{k_{p}}} ( u_{k_{p} } )\) and so \(\rho _{k_{p}+1}\le 1\). Note that (7) holds in either of the cases \(\beta _{k+1}<0\) or \(\beta _{k+1}< \underline{V}_k ( u_{k}) \).

Now we wish to invoke Attouch’s theorem ([14], Theorem 12.35), which states that an epi-convergent family of convex functions has graphically convergent subdifferentials. First note that, as \(\alpha _{k_{p}}\downarrow \alpha \) and the sets \(X( \alpha _{k_{p}}) \) are eventually bounded, the sequence of finite convex functions \(V_{\alpha _{k_{p}}} (\cdot )\) is monotonically non-increasing and pointwise convergent, and consequently also epi-converges to \(V_{\alpha } ( \cdot ) \). Thus, \(\partial V_{\alpha _{k_{p}}} ( \cdot )\) graphically converges to \(\partial V_{\alpha } (\cdot ) \).

We need to estimate the magnitude of

$$\begin{aligned} \rho _{k_{p}+1}&\ge \left( \frac{ V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d_{p}\right) - V_{\alpha _{k_{p}}} ( u_{k_{p}}) }{t_{p}}\right) \left( \frac{\beta _{k_{p}+1}-V_{ \alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\right) ^{-1} \end{aligned}$$
(12)

As \(z_{k_{p}}\) was added to the model in Step 3(b) at iteration \(k_{p}-1\), and by assumption is not dropped, we have \(z_{k_{p}}\in X_{k_{p}}\left( \alpha _{k_{p}}\right) \) when solving (3) at iteration \(k_{p}\). Thus, at iteration \(k_p\), the subgradient \(s_{k_{p}}:=Az_{k_{p}}-b\) satisfies \(V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) =u_{k_{p}}\left( Az_{k_{p} }-b\right) \). As \(z_{k_{p}}\in X_{k_{p}}\left( \alpha _{k_{p}}\right) \), it follows that \(\beta _{k_{p}+1}\ge u_{k_{p}+1}s_{k_{p}}=s_{k_{p}}\left( u_{k_{p} }+t_{p}d_{p}\right) \). Since \(s_{k_{p}}\in \partial V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) \), taking a further subsequence and re-numbering, we may assume we have a sequence of subgradients with \(s_{k_{p}}\rightarrow s = (Ax-b) \in \partial V_{\alpha } (u) \), for some \(x\in X(\alpha ) \). (Note that local Lipschitzness of \(u\mapsto V_{\alpha } (u) \) ensures \(\partial V_{\alpha }( u) \) is bounded. Then, by the graphical convergence of subdifferentials, the sequence \(\{ s_{k_{p}}\} \) is locally bounded; see [14] Exercise 5.34(b).) As (7) holds for all \(p\), we have

$$\begin{aligned} 0&>\frac{ V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d_{p}\right) - V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\ge \frac{\beta _{k_{p}+1}- V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}} \\&\ge \frac{s_{k_{p}}\left( u_{k_{p}}+t_{p}d_{p}\right) - V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\ge \frac{s_{k_{p}}\left( u_{k_{p}}+t_{p}d_{p}\right) -V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) \ }{t_{p}}=s_{k_{p}}d_{p}, \end{aligned}$$

using (11) and monotonicity. The last equality holds as \(s_{k_{p}}\in \partial V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) \) implies \(V_{\alpha _{k_{p}-1}} ( u_{k_{p}} ) =s_{k_{p}}u_{k_{p}}\). Due to Attouch’s theorem, \(s_{k_{p}}\rightarrow s\in \partial V_{\alpha } \left( u\right) \), and so \(s_{k_{p}}d_{p}\rightarrow sd\le V^{\prime }_{\alpha } \left( u,d\right) \).

As graphical convergence of sets implies epi-convergence of the associated support functions (see [14], Corollary 11.36), we have, for all \(w_p \rightarrow u\),

$$\begin{aligned} \limsup _{\,_{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}}}\inf _{d^{\prime }\rightarrow d} V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d^{\prime }\right)&=\lim _{\,_{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}}}\inf _{d^{\prime }\rightarrow d}\delta ^{*}\left( \partial V_{\alpha _{k_{p}} } \left( w_{p}\right) \right) \left( d^{\prime }\right) =V^{\prime }_{\alpha } \left( u,d\right) \text {.} \end{aligned}$$

Locally uniform Lipschitzness of \(d^{\prime }\mapsto \delta ^{*} ( \partial V_{\alpha _{k_{p}}} ( w_{p}) ) ( d^{\prime }) \) follows from the locally uniform boundedness of \(\partial V_{\alpha _{k_{p}}} \left( w_{p}\right) \). Hence

$$\begin{aligned} \lim _{\,_{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}}} V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d\right) \!=\! \limsup _{\,_{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}}}\delta ^{*}\left( \partial V_{\alpha _{k_{p}}}\left( w_{p}\right) \right) \left( d\right) \!=\! \limsup _{\,_{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}}}\inf _{d^{\prime }\rightarrow d} V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d^{\prime }\right) \!<\!0\text {.} \end{aligned}$$
(13)

Consequently, for \(p\) large we have \(V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d\right) <0\), for any \(w_{p}\rightarrow u\).

By the mean value theorem for convex functions, there exists \(\gamma _{p}\in ]0,1[\) such that \(V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d,d\right) \ge v_{p}\cdot d=\frac{V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d\right) -V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\), for some element \(v_{p}\in \partial V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d\right) \). Hence, for \(p\) large, \(V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d,d\right) <0\). Using (12), observing that both quantities in the quotient for \(\rho _{k_{p}+1}\) are negative, and using the Lipschitzness of \(V_{\alpha _{k_{p}}} \left( \cdot \right) \), we obtain a lower bound on \(\limsup _{p\rightarrow \infty }\rho _{k_{p}+1} \) of

$$\begin{aligned}&\left( \limsup _{\begin{array}{c} t_{p}\downarrow 0\\ u_{k_{p}}\rightarrow u \end{array}}\frac{ V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d\right) - V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\right) \left( s_{k_{p}}d_{p}\right) ^{-1} \\&\ge \limsup _{_{p\rightarrow \infty }} V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d,d\right) \left( sd\right) ^{-1}\ge \limsup _{\begin{array}{c} w_{p}\rightarrow u\\ p\rightarrow \infty \end{array}} V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d\right) \left( sd\right) ^{-1}. \end{aligned}$$

Thus \(\limsup _{k\rightarrow \infty }\rho _{k+1}\ge V^{\prime }_{\alpha } \left( u,d\right) \times \left( sd\right) ^{-1}\ge 1\). \(\square \)
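The quantity \(\rho _{k+1}\) bounded in this proof is the familiar trust-region ratio of actual to predicted decrease, where the prediction comes from the piecewise affine model \(\underline{V}_{k}\), which minorizes \(V_{\alpha _{k}}\) and agrees with it at \(u_{k}\). A schematic Python sketch of the acceptance test, with toy stand-ins for \(V_{\alpha _{k}}\) and \(\underline{V}_{k}\) (the functions and data are ours, not the paper's):

import numpy as np

def rho(V, V_model, u_k, u_next):
    # Actual decrease over model-predicted decrease; both are negative on
    # a descent step, and rho close to 1 means the model is locally accurate.
    return (V(u_next) - V(u_k)) / (V_model(u_next) - V(u_k))

# Toy data: V is convex; V_model is a supporting cut, exact at u_k = (1, 1).
V = lambda u: float(np.abs(u).sum())
V_model = lambda u: float(u.sum())       # minorizes V and equals it at u_k
u_k, u_next = np.array([1.0, 1.0]), np.array([0.5, 1.0])
print(rho(V, V_model, u_k, u_next))      # 1.0: the cut is exact on this segment

A step is accepted when \(\rho \ge \xi \) for a threshold \(\xi \in ]0,1[\); as used in the proof of Theorem 5.1 below, the trust region is expanded via \(\Delta _{j+l}=\min \{2\Delta _{j+l-1},\overline{\Delta }\}\) once \(\rho _{j}\ge \frac{3}{4}\).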

Proof

(of Lemma 5.2) The argument given above, in the proof of Lemma 5.1, may be applied with \(\alpha _{k}\) fixed. Take a subsequence \( \{d_{l_p} \}_{p=0}^\infty \) of \(\{ d_{l} := \frac{u_{k,l} - u_k}{\Vert u_{k,l} - u_k \Vert }\}_{l=0}^\infty \) converging to \(d\). With the subsequence \(u_{k,l_{p}}\rightarrow u_{k}\), we may associate a subsequence of diameters \(\Delta _{k,l_{p}}\downarrow 0\). Set \(u_{k,l_{p}}=u_{k}+t_{p}d_{l_{p}}\), and note that we continue to reject an update of \(u_{k}\). As \(u\mapsto V_{\alpha _{k}} \left( u\right) \) is convex, it is also semi-smooth, and so we may assume \(s_{l_{p}}\rightarrow s \in \partial _{d} V_{\alpha _{k}} \left( u_{k}\right) :=\left\{ s^{\prime }\in \partial V_{\alpha _{k}} \left( u_{k}\right) : s^{\prime }\cdot d=V^{\prime }_{\alpha _{k}} \left( u_{k} ,d \right) \right\} \). As \(z_{k,l}\) is added to \(X_{k,l}\left( \alpha _{k}\right) \) at each inner iteration and retained, we have \(z_{k,l_{p}-1}\in X_{k,l_{p}}\left( \alpha _{k}\right) \) when solving (3) at iteration \((k,l_{p})\) (and \(\alpha _{k}\) is never updated).

Consequently, \(\beta _{k,l_{p}}\ge u_{k,l_{p}}\cdot s_{l_{p}-1}=s_{l_{p}-1}\cdot \left( u_{k}+t_{p}d_{l_{p}}\right) \). Using (11), with \(\alpha _{k_{p}}\) replaced by \(\alpha _{k}\), and applying (7) to each problem in this sequence yields

$$\begin{aligned} 0&>\frac{V_{\alpha _{k}} \left( u_{k}+t_{p}d_{l_{p}}\right) -V_{\alpha _{k}} \left( u_{k}\right) }{t_{p}} \ge \frac{\beta _{k,l_{p}} - V_{\alpha _{k}} \left( u_{k}\right) }{t_{p}}\nonumber \\&\ge \frac{s_{l_{p}-1}\cdot \left( u_{k}+t_{p}d_{l_{p}}\right) - V_{ \alpha _{k}} \left( u_{k}\right) }{t_{p}} = \left[ \left( s_{l_{p}-1}\cdot d_{l_{p}}\right) -\frac{t_{p-1}}{t_{p}}\left( s_{l_{p}-1}\cdot d_{l_{p-1}}\right) \right] \nonumber \\&\quad +\left( \frac{t_{p-1}}{t_{p}}\right) \left( \frac{ V_{\alpha _{k}} \left( u_{k}+t_{p-1}d_{l_{p-1}}\right) - V_{\alpha _{k}} \left( u_{k}\right) }{t_{p-1}}\right) , \end{aligned}$$
(14)

as \(s_{l_{p}-1}\in \partial V_{\alpha _{k}} \left( u_{k}+t_{p-1}d_{l_{p-1}}\right) \). First suppose that there exists a subsequence such that \(\left\{ \frac{t_{p_{m}-1}}{t_{p_{m}}}\right\} \rightarrow \lambda \ge 0\). By semi-smoothness, we have the limiting values \(s_{l_{p}-1}\cdot d_{l_{p}}\rightarrow s\cdot d=V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \) and \(s_{l_{p}-1}\cdot d_{l_{p-1}}\rightarrow V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \), so (14) converges to the weighted sum \( \left[ 1-\lambda \right] V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) +\lambda V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) =V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \). Hence, using (12) and noting that both quantities in the quotient are negative, we obtain

$$\begin{aligned} \limsup _{p\rightarrow \infty }\rho _{k,l_{p}}&\ge \limsup _{p\rightarrow \infty }\left( \frac{V_{\alpha _{k}} \left( u_{k}+t_{p}d_{l_{p}}\right) - V_{\alpha _{k}} \left( u_{k}\right) }{t_{p}}\right) \left( \frac{\beta _{k,l_{p}}- V_{\alpha _{k}} \left( u_{k}\right) }{t_{p}}\right) ^{-1}\\&\ge \left( \lim _{t_{p}\downarrow 0,d_{l_{p}}\rightarrow d}\frac{V_{\alpha _{k}} \left( u_{k}+t_{p}d_{l_{p}}\right) -V_{\alpha _{k}} \left( u_{k}\right) }{t_{p}}\right) \left( V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \right) ^{-1}=1\text {.} \end{aligned}$$

In the alternative case, \(\left\{ \frac{t_{p-1}}{t_{p}}\right\} \) is unbounded (and \(\Delta _{k,l_{p}+1}=\gamma \Delta _{k,l_{p}}\) for some \(\gamma \in ]0,1[\)). Then \(u_{k}+t_{p-1}d_{l_{p-1}}\in \mathrm{int}\, B_{k,l_{p}}\), placing us in Step 3(a), as described in Lemma 4.2, resulting in an update of \(u_{k}\), contrary to assumption. \(\square \)

Proof

(of Corollary 5.1) We have \(\liminf _{k\rightarrow \infty }\rho _{k+1}\) given by

$$\begin{aligned} \min \left\{ \lim _{m\rightarrow \infty }\rho _{k_{m}+1} : \left\{ \rho _{k_{m}+1}\right\} _{m=0}^{\infty }\text { is a convergent subsequence of }\left\{ \rho _{k+1}\right\} _{k=0}^{\infty }\right\} \text {.} \end{aligned}$$

For part 1, apply Lemma 5.2 to \(\left\{ \rho _{k_{m}+1}\right\} _{m=0}^{\infty }\) to obtain \(\left\{ \rho _{k_{p}+1}\right\} _{p=0}^{\infty }\) with

$$\begin{aligned} \lim _{m\rightarrow \infty }\rho _{k_{m}+1}=\limsup _{p\rightarrow \infty }\rho _{k_{p}+1}\ge 1. \end{aligned}$$
(15)

For part 2, we use \(\rho _{k+1}\ge \xi \), and so

$$\begin{aligned}&\frac{ V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d_{p}\right) - V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\le \xi \left[ \frac{\beta _{k_{p}+1}- V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\right] \nonumber \\&=\xi \left[ \frac{\underline{V}_{k_{p}+1}\left( u_{k_{p}}+t_{p}d_{p}\right) - \underline{V}_{k_{p}}\left( u_{k_{p}}\right) }{t_{p}}\right] . \end{aligned}$$
(16)

As \(u_{k}\rightarrow u\notin \arg \min \left\{ V_{\alpha } \left( w \right) : w\in S^{n}\right\} \), for \(k\) large there exists a descent direction for \(V_{\alpha } \left( \cdot \right) \) at \(u_{k}\). As \(\underline{V}_{k}\left( u_{k}\right) = V_{\alpha } \left( u_{k} \right) \) and \( \underline{V}_{k}\left( \cdot \right) \) minorizes \(V_{\alpha } \left( \cdot \right) \), there must exist a greater descent in the same direction for \(\underline{V}_{k}\left( \cdot \right) \) at \(u_{k}\). As \(u_{k_{p}+1}\) solves (3), \(d_{k_{p}}:=\frac{u_{k_{p} +1}-u_{k_{p}}}{ \Vert u_{k_{p}+1}-u_{k_{p}} \Vert }\) points in the direction of maximal descent of \(\underline{V}_{k_{p}}\left( \cdot \right) \) at \(u_{k_{p}}\), and so there exists \(\delta >0\) such that \( \underline{V}_{k_{p}+1}\left( u_{k_{p}}+t_{p}d_{p}\right) - \underline{V}_{k_{p}}\left( u_{k_{p}}\right) \le -\delta t_{p} \), for \(p\) sufficiently large. Using (16) and the subgradient inequality for \(V_{\alpha _{k_{p}}} \left( \cdot \right) \) at \(u_{k_{p}}\) gives (for \(p\) large)

$$\begin{aligned} V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}},d_{p}\right) \le \frac{ V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d_{p}\right) - V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\le -\xi \delta <0. \end{aligned}$$

Using (13), we have \(V^{\prime }_{\alpha } \left( u,d\right) =\limsup _{p} V^{\prime }_{\alpha _{k_{p}}}\left( u_{k_{p}},d_{p}\right) \le -\xi \delta <0.\) We may now apply Lemma 5.1 to obtain (15) again. \(\square \)
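Throughout these proofs, problem (3) is the minimization of the current piecewise affine model over the trust region, which in epigraph form is a small linear program. The following schematic Python sketch assumes an \(\ell _{\infty }\) trust region and omits the additional restriction of \(u\) to \(S^{n}\) imposed in the paper; the cuts are invented for illustration:

import numpy as np
from scipy.optimize import linprog

S = np.array([[1.0, -2.0],
              [-1.0, 1.0],
              [0.5, 0.5]])       # hypothetical cuts s_j = A x_j - b
u_k, Delta, n = np.array([0.0, 0.0]), 0.5, 2

# Epigraph LP:  min t  s.t.  u.s_j <= t for all j,  |u_i - (u_k)_i| <= Delta.
c = np.append(np.zeros(n), 1.0)                  # variables (u, t); minimize t
A_ub = np.hstack([S, -np.ones((len(S), 1))])     # u.s_j - t <= 0
b_ub = np.zeros(len(S))
bounds = [(u_k[i] - Delta, u_k[i] + Delta) for i in range(n)] + [(None, None)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
u_next, beta = res.x[:n], res.x[n]               # beta: the model minimum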

Proof

(of Proposition 5.1) In all cases, we have \(V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Suppose that \(\rho _{k,l}<\xi \) for all \(l\). Then, applying Lemma 5.4 recursively, we have indices \(l_{p}\) with

$$\begin{aligned} V_{\alpha _{k}} \left( u_{k}\right) - \underline{V}_{k,l_{p}}\left( u_{k,l_{p}}\right)&\le \eta \left[ V_{\alpha _{k}} \left( u_{k}\right) - \underline{V}_{k,l_{p-1}}\left( u_{k,l_{p-1}}\right) \right] \le \cdots \nonumber \\&\le \eta ^{p}\left[ V_{\alpha _{k}} \left( u_{k}\right) - \underline{V}_{k,l_{0}}\left( u_{k,l_{0}}\right) \right] \rightarrow _{p\rightarrow \infty }0\text {.} \end{aligned}$$
(17)

When \(u_{k}\notin S_{\alpha _{k}}\), there is some \(\varepsilon >0\) such that \(\left\| u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \right\| _{\infty }\ge \varepsilon \) and \( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\ge \varepsilon \). By (8), \(\min \left( \frac{\Delta _{k,l_{p}}}{ \Vert u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \Vert _{\infty }},1 \right) \left( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\right) \rightarrow _{p\rightarrow \infty }0\), implying \(\Delta _{k,l_{p}}\rightarrow 0\). Apply Corollary 5.1, part 1, to get a subsequence of \(\left\{ \rho _{k,l_{p}}\right\} _{p=0}^{\infty }\) that tends to \(1\). Thus, there exists \(p\) with \(\rho _{k,l_{p}}\ge \xi >0\), a contradiction.

In the case that \(u_{k}\in S_{\alpha _{k}}\), there does not exist any descent for \( V_{\alpha _{k}} \left( \cdot \right) \) at \(u=u_{k}\). In particular, we have \(v_{k}^{*}= V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Consequently, (3) cannot generate a descent that passes the test \(\rho _{k}\ge \xi \). Using (17), we observe that

$$\begin{aligned} v_{k}^{*} - \underline{V}_{k,l_{p}}\left( u_{k,l_{p}}\right) \rightarrow _{p}0\text {.} \end{aligned}$$
(18)

As we add a new subgradient each time we solve for \(V_{\alpha _{k}} (u_{k,l}) \) to calculate \(\rho _{k,l}\), the model function \(u\mapsto \underline{V}_{k,l}(u) \) is monotonically non-decreasing in \(l\), and hence convergent to a finite, convex function. As the trust region size is monotonically non-increasing, \(\left\{ \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \right\} _{k}\) is monotonically non-decreasing. Consequently, using (18), we deduce that a subsequence of the differences converges to zero, and so for the whole sequence \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \uparrow v_{k}^{*}\). When \(V_{\alpha _{k}} \left( u_{k}\right) =v_{k}^{*}>0\) (equivalently \(v\left( SD\right) <\alpha _{k}\), the case \(v\left( SD\right) \ne \alpha _{k}\)), we find after a finite number of iterations that \(\beta _{k,l_{k}}= \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) >0\). Thus, we are in case 3(a) of Algorithm SDTR after a finite number of iterations. Alternatively, \(V_{\alpha _{k}} \left( u_{k}\right) =v_{k}^{*}=0\), so \(\alpha _{k}=v\left( SD\right) \) and \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \le 0\). This establishes part 2b of the proposition.

Suppose we have a pure IP and we add a new subgradient each time we solve \(V_{\alpha _{k}} \left( u_{k,l}\right) \) when calculating \(\rho _{k,l}\). As \(u\mapsto V_{\alpha _{k}} \left( u\right) \) is polyhedral (\(X\left( \alpha _{k}\right) \) contains a finite set of points), we add all extremal subgradients after a finite number of iterations \(l\). As \(u_{k}\in S_{\alpha _{k}}\), we have \(0\in \partial V_{\alpha _{k}} \left( u_{k}\right) \), or equivalently, \(0 \in \mathrm{co }\left\{ \left( Ax_{j}-b\right) :x_{j}\in X_{k,l}\right\} \). But \(\underline{V}_{k,l}\left( u\right) =\max \left\{ u\left( Ax_{j}-b\right) : x_{j}\in X_{k,l}\right\} \), so we also have \(0\in \partial \underline{V}_{k,l} \left( u_{k}\right) =\mathrm{co }\left\{ \left( Ax_{j}-b\right) : x_{j}\in X_{k,l}\right\} ,\) showing that \(u_{k}\) is a local (and hence global) minimum of \(u\mapsto \underline{V}_{k,l}\left( u\right) \). Then we have the inequalities \(0\le \underline{V}_{k,l}\left( u_{k}\right) =u_{k}\left( Ax_{k}-b\right) \le \min _{u\in S^{n}\cap B} \underline{V}_{k,l}\left( u\right) =\beta _{k,l},\) using \(u_{k}\left( Ax_{k}-b\right) = V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Hence \( \underline{V}_{k,l}\left( u_{k}\right) =0\) after finitely many iterations. \(\square \)
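The optimality test invoked in this last argument, \(0\in \mathrm{co}\{(Ax_{j}-b) : x_{j}\in X_{k,l}\}\), is itself a small linear feasibility problem. A minimal Python sketch, assuming SciPy is available; the subgradient data are invented for illustration:

import numpy as np
from scipy.optimize import linprog

S = np.array([[1.0, -2.0],
              [-3.0, 1.0],
              [2.0, 2.0]])       # hypothetical subgradients s_j = A x_j - b

# 0 in co{s_j}  <=>  exists lambda >= 0 with sum(lambda) = 1, S^T lambda = 0.
m = len(S)
res = linprog(c=np.zeros(m),
              A_eq=np.vstack([S.T, np.ones((1, m))]),
              b_eq=np.append(np.zeros(S.shape[1]), 1.0),
              bounds=[(0.0, None)] * m)
print("0 in convex hull:", res.status == 0)      # True: lambda = (8, 6, 5)/19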

Proof

(of Theorem 5.1) Whenever \([\underline{\alpha },\overline{\alpha }]\) is updated, the length of this interval of uncertainty decreases by at least a constant factor. Thus, finite termination is assured unless there is an infinite cycle in the trust region loop with fixed interval \([\underline{\alpha },\overline{\alpha }]\). Proposition 5.1 indicates that this can occur in only two ways: via a sequence of decreasing \(\left\{ \alpha _{k}\right\} \) and a sequence of acceptable descents (i.e., \(\rho _{k,l}\ge \xi \)) that does not terminate, or, for some \(k\), \(\alpha _{k}=v\left( SD\right) \) and \( \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \rightarrow 0= V_{\alpha _{k}} \left( u_{k}\right) \) monotonically. This latter case cannot occur when we have a pure IP.

Consider the first case, of acceptable descents. Then there exists a sequence \(\left\{ l_{k}\right\} \) such that \(u_{k+1}=u_{k,l_{k}}\) and, between iterations \(\left( k,1\right) \) and \(\left( k,l_{k}\right) \), \(\alpha \) is fixed at \(\alpha _{k}\). Thus, by Lemma 5.3,

$$\begin{aligned} \underline{V}_{k,1}\left( u_{k}\right)&- \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) =\sum _{l=1}^{l_{k}} \left[ \underline{V}_{k,l}\left( u_{k}\right) -\underline{V}_{k,l}\left( u_{k,l}\right) \right] \\&\ge \sum _{l=1}^{l_{k}}\min \left( \frac{\Delta _{k,l}}{\left\| u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \right\| _{\infty }},1\right) \left( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\right) , \end{aligned}$$

where \(u_{k+1}=u_{k,l_{k}}\) and \( \underline{V}_{k+1,0}\left( u_{k+1}\right) = \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \ge 0\) (as we are in Step 3 of SDTR). The model function \( \underline{V}_{k,l_{k}}\left( \cdot \right) \) is not carried over to the next stage; rather, \(\alpha _{k}\) is decreased to \(\alpha _{k+1}\) and we prune subgradients, which decreases the new model function to the new initial model \(\underline{V}_{k+1,1}\). Hence, we have the inequality \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \ge 0\). Thus \( \underline{V}_{k,1}\left( u_{k}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \) equals

$$\begin{aligned}&\underline{V}_{k,1}\left( u_{k}\right) - \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) + \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \\&\quad \ge \sum _{l=1}^{l_{k}}\min \left( \frac{\Delta _{k,l}}{\left\| u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \right\| _{\infty }},1\right) \left( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\right) \text {.} \end{aligned}$$

For \(M=\sum _{k=0}^{K}l_{k}\) (as we remain in Step 3 of SDTR with \( \underline{V}_{K,1}\left( u_{K}\right) \ge 0\)),

$$\begin{aligned} +\infty&> \underline{V}_{0,1}\left( u_{0}\right) \ge \underline{V}_{0,1}\left( u_{0}\right) - \underline{V}_{K,1}\left( u_{K}\right) =\sum _{k=0}^{K-1}\left[ \underline{V}_{k,1}\left( u_{k}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \right] \nonumber \\&\ge \sum _{k=0}^{K-1}\sum _{l=1}^{l_{k}}\min \left( \frac{\Delta _{k,l}}{\left\| u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \right\| _{\infty }},1\right) \left( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\right) \nonumber \\&=\sum _{j=0}^{M}\min \left( \frac{\Delta _{j}}{\left\| u_{j}-p_{\alpha _{j}}\left( u_{j}\right) \right\| _{\infty }},1\right) \left( V_{\alpha _{j}} \left( u_{j}\right) -v_{j}^{*}\right) , \end{aligned}$$
(19)

where \(p_{\alpha _{j}}\left( u\right) \) denotes the projection of \(u\) onto the solution set \(S_{\alpha _{j}}\subseteq S^{n}\) of minimizers of \(u\mapsto V_{\alpha _{j}} \left( u\right) \), and \(v_{j}^{*} := \min _{u\in S^{n}} V_{\alpha _{j}} \left( u\right) \). The convergence of the series implies that the terms in (19) converge to zero. Next, note that each time we accept a subgradient (as we have obtained sufficient descent), we decrease the interval \([\underline{\alpha },\alpha _{k}]\). Thus, \(\alpha _{j}\downarrow \underline{\alpha }\) as \(j\rightarrow \infty \). As \(\min _{u\in S^{n}} V_{\underline{\alpha }} \left( u\right) <0\), and we assume we remain in Step 3 of SDTR, we have \( V_{\alpha _{j}} \left( u_{j}\right) \ge 0\) and hence \(\left\| u_{j}-p_{\alpha _{j}}\left( u_{j}\right) \right\| _{\infty }\ge \delta >0\) for some \(\delta \). Thus, there is some \(\epsilon >0\) such that \( V_{\alpha _{j}} \left( u_{j}\right) -v_{j}^{*} \ge \epsilon >0\). Consequently, we must have \(\Delta _{j}\rightarrow 0\).

Consider first the case of a MIP. As \(\alpha _{j}\downarrow \underline{\alpha }\), we may apply Corollary 5.1 to deduce that \(\liminf _{j}\rho _{j}\ge 1\). But this implies that there exists \(J\) for which \(\rho _{j}\ge \frac{3}{4}\) for \(j\ge J\), forcing \(\Delta _{j+l}=\min \left\{ 2\Delta _{j+l-1},\overline{\Delta }\right\} \). Eventually, we must have \(\Delta _{j}=\overline{\Delta }>0\), a contradiction.

In the case of a pure IP (no continuous variables), note that as \(\alpha _{j}\downarrow \underline{\alpha }\), with \(X\left( \alpha _{j}\right) :=\left\{ x\in X : cx\le \alpha _{j}\right\} \), the discrete components of \(X\left( \alpha _{j}\right) \) do not change for \(j\) sufficiently large. As we have a pure IP, the function \( V_{\alpha _{j}} \left( \cdot \right) \) must therefore be constant and equal to \(V_{\underline{\alpha }} \left( \cdot \right) \) for \(j\) sufficiently large. We may now apply Corollary 5.1, with \(\alpha _{k}=\underline{\alpha }\) constant, to deduce that \(\liminf _{j}\rho _{j}\ge 1\).

Thus, we cannot have an infinite loop of acceptable descents without an update of a lower or upper bound. After a finite number of iterations, we must have \(\left| \overline{\alpha }-\underline{\alpha }\right| \le \varepsilon \). \(\square \)
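The termination argument rests on the bracketing interval \([\underline{\alpha },\overline{\alpha }]\) shrinking by at least a constant factor at every update, exactly as in bisection. A generic Python sketch of such an outer loop; the predicate is a placeholder for a full inner trust-region solve, not the paper's implementation:

def bracket(alpha_lo, alpha_hi, keeps_upper_bound, eps=1e-6):
    # Shrink the interval of uncertainty [alpha_lo, alpha_hi] geometrically:
    # each update discards half the interval, so the loop halts with
    # alpha_hi - alpha_lo <= eps after O(log(1/eps)) iterations.
    while alpha_hi - alpha_lo > eps:
        alpha = 0.5 * (alpha_lo + alpha_hi)
        if keeps_upper_bound(alpha):   # placeholder for the inner loop's verdict
            alpha_hi = alpha
        else:
            alpha_lo = alpha
    return alpha_lo, alpha_hi

# Example: bracketing sqrt(2) with the predicate alpha**2 >= 2.
print(bracket(1.0, 2.0, lambda a: a * a >= 2.0))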


Cite this article

Boland, N., Eberhard, A.C. & Tsoukalas, A. A Trust Region Method for the Solution of the Surrogate Dual in Integer Programming. J Optim Theory Appl 167, 558–584 (2015). https://doi.org/10.1007/s10957-014-0681-9
