Abstract
We propose an algorithm for solving the surrogate dual of a mixed integer program. The algorithm uses a trust region method based on a piecewise affine model of the surrogate dual value function. A new and much more flexible way of updating bounds on the surrogate dual's value is proposed, which numerical experiments show to be advantageous. A proof of convergence is given, and numerical tests show that the method outperforms a state-of-the-art subgradient solver. Incorporating the surrogate dual value as a cut added to the integer program is shown to greatly reduce the solution times of a standard commercial solver on a specific class of problems.
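To fix ideas, the core mechanic the abstract describes, minimizing a piecewise affine (cutting-plane) model of the dual value function over a trust region, can be sketched as follows. This is an illustrative toy, not the paper's SDTR algorithm: the cut data, the brute-force grid minimization, and all names are our own simplifications.

```python
# Illustrative sketch (not the paper's SDTR algorithm): one trust-region step
# on a piecewise-affine model  V_model(u) = max_j (s_j . u + c_j)  built from
# subgradient cuts, minimized over the l_inf box ||u - u_k||_inf <= delta.
from itertools import product

def model_value(u, cuts):
    """Piecewise-affine model: max over affine cuts (s, c) of s.u + c."""
    return max(sum(si * ui for si, ui in zip(s, u)) + c for s, c in cuts)

def trust_region_step(u_k, cuts, delta, grid=21):
    """Approximately minimize the model over the trust-region box by
    brute-force grid search (for illustration only, not for real use)."""
    axes = [[uk_i - delta + 2 * delta * t / (grid - 1) for t in range(grid)]
            for uk_i in u_k]
    return min(product(*axes), key=lambda u: model_value(u, cuts))

# Two cuts of V(u) = |u| around u_k = 0.8: the model is exact here, and the
# step moves toward the minimizer u = 0 but stops at the trust-region boundary.
cuts = [((1.0,), 0.0), ((-1.0,), 0.0)]   # cuts s.u + c with s = +/-1, c = 0
u_next = trust_region_step((0.8,), cuts, delta=0.5)
```

In a genuine bundle/trust-region method the model minimization would be a small LP and the trust-region radius would adapt to the ratio of actual to predicted decrease, as in the paper's \(\rho\) test.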
References
Greenberg, H.J., Pierskalla, W.P.: Surrogate mathematical programming. Oper. Res. 18, 924–939 (1970)
Karwan, M., Rardin, R.: Surrogate dual multiplier search procedures in integer programming. Oper. Res. 32, 52–69 (1984)
Karwan, M., Rardin, R.: Some relationships between Lagrangian and surrogate duality in integer programming. Math. Progr. 17, 320–334 (1979)
Kim, S.-L., Kim, S.: Exact algorithm for the surrogate dual of an integer programming problem: subgradient method approach. J. Optim. Theory Appl. 96, 363–375 (1998)
Sarin, S., Karwan, M., Rardin, R.: Surrogate duality in a branch-and-bound procedure for integer programming. Eur. J. Oper. Res. 33, 326–333 (1988)
Li, D., Sun, X.: Nonlinear Integer Programming. International Series in Operations Research & Management, Springer (2006)
Noll, D.: Bundle method for non-convex minimization with inexact subgradients and function values. Comput. Anal. Math. 50, 555–592 (2013)
Linderoth, J., Wright, S.: Decomposition algorithms for stochastic programming on a computational grid. Comput. Opt. Appl. 24, 207–250 (2003)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 306. Springer-Verlag, Berlin (1993)
Boland, N., Eberhard, A., Tsoukalas, A.: A trust region method for the solution of the surrogate dual in integer programming. Optimization Online (2014). http://www.optimization-online.org/DB_HTML/2014/02/4249.html
Frangioni, A.: Generalized bundle methods. SIAM J. Optim. 13, 117–156 (2002)
Han, B., Leblet, J., Simon, G.: Hard multidimensional multiple choice knapsack problems: an empirical study. Comput. Oper. Res. 37, 172–181 (2010)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. A series of comprehensive studies in mathematics. Springer, Berlin (1998)
Acknowledgments
We thank two anonymous referees, whose constructive comments improved the paper. This research was supported by the ARC Discovery Grant No. DP0987445.
Appendix
In the following, we will denote the support function of a closed convex set \(A\) by \(\delta ^{*}\left( A\right) \left( u\right) :=\sup \left\{ au : a\in A\right\} .\) We use epi-limits and Attouch's theorem, for which the reader may consult [14] for details.
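As a concrete instance of this notation (a standard fact, not specific to this paper):

```latex
% Support function of the closed l_infinity unit ball
% B := { a : ||a||_infinity <= 1 }:
\delta^{*}\!\left( B\right) \left( u\right)
  = \sup \left\{ au : \left\| a\right\| _{\infty }\le 1\right\}
  = \sum _{i}\left| u_{i}\right|
  = \left\| u\right\| _{1},
% attained at a_i = sign(u_i); as always, \delta^{*}(B)(.) is convex
% and positively homogeneous.
```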
Proof
(of Lemma 4.1) Suppose \(V_{\alpha }(u+\gamma d) < V_{\alpha }(u) <+\infty \) for some sufficiently small \(\gamma >0\), and let \(x\in \arg \max \left\{ u\left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} \). Then \(x\in X\left( \alpha \right) \) and so
implying \(d\left( Ax-b\right) <0\) holds for all \(x\in \arg \max \left\{ u\left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} \). Consequently, this is also true for all \(s\) defined as in (5).
Conversely, for all \(x_{\gamma }\in M\left( \gamma \right) :=\arg \max \left\{ \left( u+\gamma d\right) \left( Ax^{\prime }-b\right) : x^{\prime }\in X\left( \alpha \right) \right\} ,\)
Note that since \(M(0) \) is bounded and the functions \(g_{\gamma }( x) := (u+\gamma d) (Ax-b) +\delta _{X(\alpha ) }(x) \) form a family of level-set-bounded, proper convex functions, epi-converging as \(\gamma \downarrow 0\) to \(g(x) := u( Ax-b) +\delta _{X( \alpha ) }(x) \), we may invoke [14], Exercise 7.32 (c) and Theorem 7.33 to deduce \(\limsup _{\gamma \downarrow 0}M(\gamma ) \subseteq M( 0)\). When we assume \(d(Ax-b) <0\) for all \(x\in M(0) \), we have \(d( Ax_{\gamma }-b) <0\) for \(\gamma \) sufficiently small. Assuming otherwise (i.e., that there exist \(x_{\gamma _{m}}\in M(\gamma _{m}) \) for \(\gamma _{m}\downarrow 0\) with \(d(Ax_{\gamma _{m}}-b) \ge 0\)) leads to the following contradiction. As \(V_{\alpha }(\cdot ) \) is a finite, convex function, it is locally Lipschitz, and so \(\partial V_{\alpha }(\cdot ) \) is locally uniformly bounded, implying local boundedness of \(s_{\gamma }:= ( Ax_{\gamma }-b)\). This in turn implies local boundedness of \(\{ x_{\gamma _{m}}\} \), and on taking any convergent subsequence \(x_{\gamma _{m_{k}}}\rightarrow x\in M(0) \), we find that the assumption \(d(Ax_{\gamma _{m_{k}} }-b) \ge 0\) implies the contradiction \(d(Ax-b) \ge 0\) for some \(x\in M(0) \). Thus, \(d( Ax_{\gamma }-b) <0\) and \( V_{ \alpha }(u+\gamma d ) \le V_{\alpha } (u) +\gamma d(Ax_{\gamma }-b) < V_{\alpha }(u) \), for \(\gamma \) small. If there exists a sequence \(x_{\gamma _{m}}\in M(\gamma _{m}) \) such that \(x_{\gamma _{m} }\rightarrow x\in M( 0) \) with \(d(Ax-b) <0\), then the same argument implies \(d\) is a descent direction. Indeed, the presumption that there exists a further subsequence such that \(d( Ax_{\gamma _{m_{k}}}-b) \ge 0\) for all \(k\) implies the contradiction \(d(Ax-b) \ge 0\). Thus, \(\gamma _{m}d(Ax_{\gamma _{m} }-b) <0\) for \(m\) large, implying \(V_{ \alpha }( u+\gamma _{m}d ) < V_{\alpha }(u) \).
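The inclusion \(\limsup _{\gamma \downarrow 0}M(\gamma ) \subseteq M( 0)\) can be seen in miniature over a finite feasible set, the relevant situation for a pure IP. The following toy example uses hypothetical data; the points stand in for \(\{Ax-b : x\in X(\alpha )\}\) and are not taken from the paper:

```python
# Toy illustration of  limsup_{gamma -> 0} M(gamma) subset M(0):
# over a finite set, maximizers of the perturbed objective (u + gamma d).g
# lie inside the unperturbed argmax once gamma is small.
def argmax_set(w, points):
    """All points attaining the maximum of w . x over the finite set."""
    vals = {x: w[0] * x[0] + w[1] * x[1] for x in points}
    best = max(vals.values())
    return {x for x, v in vals.items() if abs(v - best) < 1e-12}

points = {(0, 2), (2, 0), (1, 1)}      # stands in for {Ax - b : x in X(alpha)}
u, d = (1.0, 1.0), (1.0, -1.0)
M0 = argmax_set(u, points)             # ties: all three points achieve value 2
w = (u[0] + 1e-3 * d[0], u[1] + 1e-3 * d[1])
Mg = argmax_set(w, points)             # a small perturbation breaks the ties
```

Here `Mg` collapses to the single point favored by the perturbation direction, and it is contained in `M0`, exactly the upper semicontinuity of the argmax map used above.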
Finally, when \(x_{\gamma }\in M( \gamma ) \) we have \(s_{\gamma }:= ( Ax_{\gamma }-b) \in \partial V_{\alpha } ( u+\gamma d)\), and as \(V_{\alpha }(\cdot ) \) is a finite, convex function, it is also semi-smooth. Consequently, any convergent subsequence \(s_{\gamma _{m}}:= ( Ax_{\gamma _{m}}-b) \rightarrow s= ( Ax-b) \in \partial _{d} V_{\alpha } (u)\), for some \(x\in M(0) \). Assuming that \(d( Ax-b) <0\) for \(s= (Ax-b) \in \partial _{d} V_{\alpha } (u) \), we may argue as above that, for \(\gamma \) sufficiently small, \(d(Ax_{\gamma }-b) <0\). Hence \(V_{\alpha }(u+\gamma d) < V_{\alpha }(u) \). \(\square \)
Proof
(of Lemma 5.1) We assume we have a sequence of trust regions of diameter \(\Delta _{k}\downarrow 0\), and an associated sequence of \(u_{k+1}\), generated in the evaluation of \(V_{\alpha _{k}} (u_{k+1} ) \) when calculating \(\rho _{k+1}\). In Step 3 of Algorithm SDTR, we add some \(s_{k+1}\in \partial V_{\alpha _{k}} \left( u_{k+1}\right) \) by choosing \(s_{k+1}=Az_{k+1}-b\).
By assumption, \(\left\| u_{k+1}-u_{k}\right\| _{\infty }\downarrow 0\). As \(u_{k}\rightarrow u\), we have \(\left\| u_{k+1}-u\right\| _{\infty }\rightarrow 0\). Also by assumption, there exists a subsequence of \(\left\{ u_{k}\right\} _{k=0}^{\infty }\), denoted by \(\left\{ u_{k_{p}}\right\} _{p=0}^{\infty }\), whose normalized steps \(d_{p}\) converge to a direction \(d\) such that \(v^{\prime }(P(\alpha ,\cdot )) ( u,d) <0\). Denote \(t_{p}:=\left\| u_{k_{p}+1}-u_{k_{p}}\right\| _{\infty }\); then \(u_{k_{p}+1}-u_{k_{p}}=t_{p}d_{p}\). Note that in Step 3(b), just before returning to Step 3, we add \(z_{k_{p}+1}\in \arg \max \left\{ u_{k_{p}+1}\left( Ax-b\right) : x\in X\left( \alpha _{k_{p}}\right) \right\} \), found while calculating \(\rho _{{k_p}+1}\). This implies \(\left( Az_{k_{p}+1}-b\right) \) must satisfy the following:
Take \(y_{p}\in X\left( \alpha _{k_{p}}\right) \) satisfying \(\beta _{k_{p} +1}=u_{k_{p}+1}\left( Ay_{p}-b\right) \) in the problem (3) from which we obtain \(\left( u_{k_{p}+1},\beta _{k_{p}+1}\right) \). Due to (10) and \(y_{p}\in X_{k_{p}}( \alpha _{k_{p}})\), we have
When (7) holds, we have \(0> V_{\alpha _{k_{p}}} ( u_{k_{p}+1} ) - V_{\alpha _{k_{p}}} ( u_{k_{p}} )\ge \beta _{k_{p}+1}- V_{\alpha _{k_{p}}} ( u_{k_{p} } )\) and so \(\rho _{k_{p}+1}\le 1\). Note that (7) holds in either of the cases: \(\beta _{k+1}<0\) or \(\beta _{k+1}< \underline{V}_k ( u_{k}) \).
Now we wish to invoke Attouch's theorem ([14], Theorem 12.35), which states that an epi-convergent family of convex functions has graphically convergent subdifferentials. First we note that, as \(\alpha _{k_{p}}\downarrow \alpha \) and the sets \(X( \alpha _{k_{p}}) \) are eventually bounded, the sequence of finite convex functions \(V_{\alpha _{k_{p}}} (\cdot )\) is monotonically non-increasing and pointwise convergent, and consequently also epi-converges to \(V_{\alpha } ( \cdot ) \). Thus, \(\partial V_{\alpha _{k_{p}}} ( \cdot )\) graphically converges to \(\partial V_{\alpha } (\cdot ) \).
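For reference, the form of Attouch's theorem used here may be paraphrased as follows. This is a hedged restatement; see [14], Theorem 12.35, for the precise version, which also involves a normalization condition on function values:

```latex
% For proper, lsc, convex functions f_p epi-converging to f:
f_{p} \xrightarrow{\;e\;} f
\quad \Longrightarrow \quad
\operatorname{gph} \partial f_{p} \longrightarrow \operatorname{gph} \partial f ,
% i.e.  s \in \partial f(u)  exactly when there exist  u_p -> u  and
% s_p \in \partial f_p(u_p)  with  s_p -> s.
```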
We need to estimate the magnitude of
As \(z_{k_{p}}\) was added to the model in Step 3(b) at iteration \(k_{p}-1\), and by assumption is not dropped, we have \(z_{k_{p}}\in X_{k_{p}}\left( \alpha _{k_{p}}\right) \) when solving (3) at iteration \(k_{p}\). Thus, at iteration \(k_{p}\), the subgradient \(s_{k_{p}}:=Az_{k_{p}}-b\) satisfies \(V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) =u_{k_{p}}\left( Az_{k_{p} }-b\right) \). As \(z_{k_{p}}\in X_{k_{p}}\left( \alpha _{k_{p}}\right) \), it follows that \(\beta _{k_{p}+1}\ge u_{k_{p}+1}s_{k_{p}}=s_{k_{p}}\left( u_{k_{p} }+t_{p}d_{p}\right) \). Since \(s_{k_{p}}\in \partial V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) \), taking a further subsequence and re-numbering, we may assume we have a sequence of subgradients with \(s_{k_{p}}\rightarrow s = (Ax-b) \in \partial V_{\alpha } (u) \), for some \(x\in X(\alpha ) \). (Note that local Lipschitzness of \(u\mapsto V_{\alpha } (u) \) ensures \(\partial V_{\alpha }( u) \) is bounded. Then, by the graphical convergence of subdifferentials, the sequence \(\{ s_{k_{p}}\} \) is locally bounded; see [14], Exercise 5.34(b).) As (7) holds for all \(p\), we have
using (11) and monotonicity. The last equality holds as \(s_{k_{p}}\in \partial V_{\alpha _{k_{p}-1}} \left( u_{k_{p}}\right) \) implies \(V_{\alpha _{k_{p}-1}} ( u_{k_{p}} ) =s_{k_{p}}u_{k_{p}}\). Due to Attouch’s theorem, \(s_{k_{p}}\rightarrow s\in \partial V_{\alpha } \left( u\right) \), and so \(s_{k_{p}}d_{p}\rightarrow sd\le V^{\prime }_{\alpha } \left( u,d\right) \).
As graphical convergence of sets implies the epi-convergence of the associated support functions, (see [14], Corollary 11.36), we have for all \(w_p \rightarrow u\),
Local Lipschitzness of \(d^{\prime }\mapsto \delta ^{*} ( \partial V_{\alpha _{k_{p}}} ( w_{p}) ) ( d^{\prime }) \), uniform in \(p\), follows from the locally uniform boundedness of \(\partial V_{\alpha _{k_{p}}} \left( w_{p}\right) \). Hence
Consequently, for \(p\) large we have \(V^{\prime }_{\alpha _{k_{p}}} \left( w_{p},d\right) <0\), for any \(w_{p}\rightarrow u\).
By the mean value theorem for convex functions, there exist \(\gamma _{p}\in \,]0,1[\) and \(v_{p}\in \partial V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d_{p}\right) \) such that \(V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d_{p},d_{p}\right) \ge v_{p}d_{p}=\frac{V_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}d_{p}\right) -V_{\alpha _{k_{p}}} \left( u_{k_{p}}\right) }{t_{p}}\). Hence, for \(p\) large, \(V^{\prime }_{\alpha _{k_{p}}} \left( u_{k_{p}}+t_{p}\gamma _{p}d_{p},d_{p}\right) <0\). Using (12), observing that both quantities in the quotient for \(\rho _{k_{p}+1}\) are negative, and using the Lipschitzness of \(V_{\alpha _{k_{p}}} \left( \cdot \right) \), we obtain a lower bound on \(\limsup _{p\rightarrow \infty }\rho _{k_{p}+1} \) of
Thus \(\limsup _{k\rightarrow \infty }\rho _{k+1}\ge V^{\prime }_{\alpha } \left( u,d\right) \times \left( sd\right) ^{-1}\ge 1\). \(\square \)
Proof
(of Lemma 5.2) The argument given above, in the proof of Lemma 5.1, may be applied with \(\alpha _{k}\) fixed. Take a subsequence \( \{d_{l_p} \}_{p=0}^\infty \) of \(\{ d_{l} := \frac{u_{k,l} - u_k}{\Vert u_{k,l} - u_k \Vert }\}_{l=0}^\infty \) converging to \(d\). With the subsequence \(u_{k,l_{p}}\rightarrow u_{k}\), we may associate a subsequence of diameters \(\Delta _{k,l_{p}}\downarrow 0\). Write \(u_{k,l_{p}}=u_{k}+t_{p}d_{l_{p}}\), and note that we continue to reject an update of \(u_{k}\). As \(u\mapsto V_{\alpha _{k}} \left( u\right) \) is convex, it is also semi-smooth, and so we may assume \(s_{l_{p}}\rightarrow s \in \partial _{d} V_{\alpha _{k}} \left( u_{k}\right) :=\left\{ s^{\prime }\in \partial V_{\alpha _{k}} \left( u_{k}\right) : s^{\prime }\cdot d=V^{\prime }_{\alpha _{k}} \left( u_{k} ,d \right) \right\} \). As \(z_{k,l_{p-1}}\) is added to \(X_{k,l_{p-1}}\left( \alpha _{k}\right) \) and retained, we have \(z_{k,l_{p-1}}\in X_{k,l_{p}}\left( \alpha _{k}\right) \) when solving (3) at iteration \((k,l_{p})\) (and \(\alpha _{k}\) is never updated).
Consequently, \(\beta _{k,l_{p}}\ge u_{k,l_{p}}\cdot s_{l_{p}-1}=s_{l_{p}-1}\cdot \left( u_{k}+t_{p}d_{l_{p}}\right) \). Using (11) with \(\alpha _{k_{p}}\) replaced by \(\alpha _{k}\), and applying (7) to each problem in this sequence, yields
as \(s_{l_{p}-1}\in \partial V_{\alpha _{k}} \left( u_{k}+t_{p-1}d_{p-1}\right) \). First suppose that there exists a subsequence such that \(\left\{ \frac{t_{p_{m}-1}}{t_{p_{m}}}\right\} \rightarrow \lambda \ge 0\). By semi-smoothness, we have the limiting values \(s_{l_{p}-1}\cdot d_{p}\rightarrow s\cdot d=V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \) and \(s_{p-1}\cdot d_{p-1}\rightarrow V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \), so (14) converges to the weighted sum \( \left[ 1-\lambda \right] V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) +\lambda V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) =V^{\prime }_{\alpha _{k}} \left( u_{k},d\right) \). Hence, using (12) and noting both quantities in the quotient are negative, we obtain
In the alternative case, \(\left\{ \frac{t_{p-1}}{t_{p}}\right\} \) is unbounded (and \(\Delta _{k,l_{p}+1}=\gamma \Delta _{k,l_{p}}\) for some \(\gamma \in ]0,1[\)). Then \(u_{k}+t_{p-1}d_{p-1}\in \mathrm{int }B_{k,l_{p}}\), placing us in Step 3(a), as described in Lemma 4.2, resulting in an update of \(u_{k}\), contrary to assumption. \(\square \)
Proof
(of Corollary 5.1) We have \(\liminf _{k\rightarrow \infty }\rho _{k+1}\) given by
For part 1, apply Lemma 5.2 to \(\left\{ \rho _{k_{m}+1}\right\} _{m=0}^{\infty }\) to obtain \(\left\{ \rho _{k_{p}+1}\right\} _{p=0}^{\infty }\) with
For part 2, we use \(\rho _{k+1}\ge \xi \), and so
As \(u_{k}\rightarrow u \notin \arg \min \left\{ V_{\alpha } \left( w \right) : w\in S^{n}\right\} \), for \(k\) large, there exists a descent direction for \(V_{\alpha } \left( \cdot \right) \) at \(u_{k}\). As \(\underline{V}_{k}\left( u_{k}\right) = V_{\alpha } \left( u_{k} \right) \) and \( \underline{V}_{k}\left( \cdot \right) \) minorizes \(V_{\alpha } \left( \cdot \right) \), there must exist at least as great a descent in the same direction for \(\underline{V}_{k}\left( \cdot \right) \) at \(u_{k}\). As \(u_{k_{p}+1}\) solves (3), \(d_{k_{p}}:=\frac{u_{k_{p} +1}-u_{k_{p}}}{ \Vert u_{k_{p}+1}-u_{k_{p}} \Vert }\) is the direction of maximal descent of \(\underline{V}_{k_{p}}\left( \cdot \right) \) at \(u_{k_{p}}\), and so there exists \(\delta >0\) such that \( \underline{V}_{k_{p}+1}\left( u_{k_{p}}+t_{p}d_{p}\right) - \underline{V}_{k_{p}}\left( u_{k_{p}}\right) \le -\delta t_{p} \), for \(p\) sufficiently large. Using (16) and the subgradient inequality for \(V_{\alpha _{k_{p}}} \left( \cdot \right) \) at \(u_{k_{p}}\) gives (for \(p\) large)
Using (13), we have \(V^{\prime }_{\alpha } \left( u,d\right) =\limsup _{p} V^{\prime }_{\alpha _{k_{p}}}\left( u_{k_{p}},d_{p}\right) \le -\xi \delta <0.\) We may now apply Lemma 5.1 to get (15) again. \(\square \)
Proof
(of Proposition 5.1) In all cases, we have \(V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Suppose that \(\rho _{k,l}<\xi \) for all \(l\). Then, applying Lemma 5.4 recursively, we have indices \(l_{p}\) with
When \(u_{k}\notin S_{\alpha _{k}}\), there is some \(\varepsilon >0\) such that \(\left\| u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \right\| _{\infty }\ge \varepsilon \) and \( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\ge \varepsilon \). By (8), \(\min \left( \frac{\Delta _{k,l_{p}}}{ \Vert u_{k}-p_{\alpha _{k}}\left( u_{k}\right) \Vert _{\infty }},1 \right) \left( V_{\alpha _{k}} \left( u_{k}\right) -v_{k}^{*}\right) \rightarrow _{p\rightarrow \infty }0\), implying \(\Delta _{k,l_{p}}\rightarrow 0\). Applying Corollary 5.1, part 1, we obtain a subsequence of \(\left\{ \rho _{k,l_{p}}\right\} _{p=0}^{\infty }\) that tends to \(1\). Thus, there exists a \(p\) with \(\rho _{k,l_{p}}\ge \xi >0\), a contradiction.
In the case that \(u_{k}\in S_{\alpha _{k}}\), there does not exist any descent for \( V_{\alpha _{k}} \left( \cdot \right) \) at \(u=u_{k}\). In particular, we have \(v_{k}^{*}= V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Consequently, (3) cannot generate a descent that passes the test \(\rho _{k}\ge \xi \). Using (17), we observe that
As we add a new subgradient each time we solve for \(V_{\alpha _{k}} (u_{k,l}) \) to calculate \(\rho _{k,l}\), the model function \(u\mapsto \underline{V}_{k,l_{k}}(u) \) is monotonically increasing, and hence convergent to a finite, convex function. As the trust region size is monotonically non-increasing, \(\left\{ \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \right\} _{k}\) is monotonically non-decreasing. Consequently, using (18), we deduce that a subsequence converges to zero, and so for the whole sequence \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \uparrow v_{k}^{*}\). When \(V_{\alpha _{k}} \left( u_{k}\right) =v_{k}^{*}>0\) (equivalently, \(v\left( SD\right) <\alpha _{k}\), i.e., the case \(v\left( SD\right) \ne \alpha _{k}\)), we find after a finite number of iterations that \(\beta _{k,l_{k}}= \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) >0\). Thus, we are in case 3(a) of Algorithm SDTR after a finite number of iterations. Alternatively, \(V_{\alpha _{k}} \left( u_{k}\right) =v_{k}^{*}=0\), so \(\alpha _{k}=v\left( SD\right) \) and \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \le 0\). This establishes part 2b of the proposition.
Suppose we have a pure IP and we add a new subgradient each time we solve for \(V_{\alpha _{k}} \left( u_{k,l}\right) \) when calculating \(\rho _{k,l}\). As \(u\mapsto V_{\alpha _{k}} \left( u\right) \) is polyhedral (\(X\left( \alpha _{k}\right) \) contains a finite set of points), we add all extremal subgradients after a finite number of iterations \(l\). As \(u_{k}\in S_{\alpha _{k}}\), we have \(0\in \partial V_{\alpha _{k}} \left( u_{k}\right) \) or, equivalently, \(0 \in \mathrm{co }\left\{ Ax_{j}-b :x_{j}\in X_{k,l}\right\} \). But \(\underline{V}_{k,l}\left( u\right) =\max \left\{ u\left( Ax_{j}-b\right) : x_{j}\in X_{k,l}\right\} \), so we also have \(0\in \partial \underline{V}_{k,l} \left( u_{k}\right) =\mathrm{co }\left\{ Ax_{j}-b : x_{j}\in X_{k,l}\right\} ,\) showing that \(u_{k}\) is a local (and hence global) minimum of \(u\mapsto \underline{V}_{k,l}\left( u\right) \). Then we have the inequalities \(0\le \underline{V}_{k,l}\left( u_{k}\right) =u_{k}\left( Ax_{k}-b\right) \le \min _{u\in S^{n}\cap B} \underline{V}_{k,l}\left( u\right) =\beta _{k,l},\) using the inequality \(u_{k}\left( Ax_{k}-b\right) = V_{\alpha _{k}} \left( u_{k}\right) \ge 0\). Hence \( \underline{V}_{k,l}\left( u_{k}\right) =0\) after finitely many iterations. \(\square \)
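The finite-termination argument for the pure IP case rests on an elementary fact: once the zero vector lies in the convex hull of the cut slopes, the polyhedral model is nonnegative everywhere, so its minimum value is zero. A toy numerical check with hypothetical subgradients (not data from the paper):

```python
# Toy illustration: if 0 is a convex combination of the cut slopes s_j, then
# V(u) = max_j s_j . u satisfies V(u) >= 0 for every u (since the weighted
# average of the s_j . u is zero, the max cannot be negative), and u = 0
# attains the minimum value 0.
import random

subgrads = [(1.0, 0.0), (-1.0, 1.0), (0.0, -1.0)]   # equal 1/3-weights sum to 0

def model(u):
    """Polyhedral model: max over stored subgradients of s . u."""
    return max(s[0] * u[0] + s[1] * u[1] for s in subgrads)

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
assert model((0.0, 0.0)) == 0.0                 # u = 0 attains the value 0
assert all(model(u) >= 0.0 for u in samples)    # the model is nonnegative
```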
Proof
(of Theorem 5.1) Whenever \([\overline{\alpha },\underline{\alpha }]\) is updated, we decrease the length of this interval of uncertainty by at least a constant factor. Thus, finite termination is assured unless we have an infinite cycle in the trust region loop, with fixed interval \([\overline{\alpha },\underline{\alpha }]\). Proposition 5.1 indicates that this can only occur in two ways: either via a sequence of decreasing \(\left\{ \alpha _{k}\right\} \) and a non-terminating sequence of acceptable descents (i.e., \(\rho _{k,l}\ge \xi \)), or because for some \(k\) we have \(\alpha _{k}=v\left( SD\right) \) and \( \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \rightarrow 0= V_{\alpha _{k}} \left( u_{k}\right) \), monotonically. This latter case cannot occur when we have a pure IP.
Consider the first case of acceptable descents. Then there exists a sequence \(\left\{ l_{k}\right\} \) such that \(u_{k+1}=u_{k,l_{k}}\) and between iteration \(\left( k,1\right) \) and \(\left( k,l_{k}\right) \), we have \(\alpha _{k+l}=\alpha _{k}\) fixed. Thus, by Lemma 5.3,
where \(u_{k+1}=u_{k,l_{k}}\) and \( \underline{V}_{k+1,0}\left( u_{k+1}\right) = \underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) \ge 0\) (as we are in Step 3 of SDTR). The model function \( \underline{V}_{k,l_{k}}\left( \cdot \right) \) is not carried to the next stage; rather, \(\alpha _{k}\) is decreased to \(\alpha _{k+1}\), and we prune subgradients, which decreases the model function to the new initial model \(\underline{V}_{k+1,1}\). Hence, we have the inequality \(\underline{V}_{k,l_{k}}\left( u_{k,l_{k}}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \ge 0\). Thus \( \underline{V}_{k,1}\left( u_{k}\right) - \underline{V}_{k+1,1}\left( u_{k+1}\right) \) equals
For \(M=\sum _{k=0}^{K}l_{k}\), (as we remain in Step 3 of SDTR with \( \underline{V}_{K,1}\left( u_{K}\right) \ge 0\)),
where \(p_{\alpha _{j}}\left( u\right) \) denotes the projection of \(u\) onto the solution set \(S_{\alpha _{j}}\subseteq S^{n}\) of minimizers of \(u\mapsto V_{\alpha _{j}} \left( u\right) \) and \(v_{j}^{*} := \min _{u\in S^{n}} V_{\alpha _{j}} \left( u\right) \). The convergence of the series implies the terms in (19) converge to zero. Next, note that each time we accept a subgradient, (as we have obtained sufficient descent), we decrease the interval \([\underline{\alpha },\alpha _{k}].\) Thus, we have \(\alpha _{j}\downarrow \underline{\alpha }\) as \(j\rightarrow \infty \). As \(\min _{u\in S^{n}} V_{\underline{\alpha }} \left( u\right) <0\), and we assume we remain in Step 3 of SDTR, we have \( V_{\alpha _{j}} \left( u_{j}\right) \ge 0\) and hence: \(\left\| u_{j}-p_{\alpha _{j}}\left( u_{j}\right) \right\| _{\infty }\ge \delta >0\) for some \(\delta \). Thus, there is some \(\epsilon >0\) such that \( V_{\alpha _{j}} \left( u_{j}\right) -v_{j}^{*} \ge \epsilon >0\). Consequently, we must have \(\Delta _{j}\rightarrow 0.\)
Consider the first case when we have a MIP. Note that each time we accept a subgradient, (as we have obtained sufficient descent), we reduce the interval \([\underline{\alpha },\alpha _{k}].\) Thus, we have \(\alpha _{j}\downarrow \underline{\alpha }\) as \(j\rightarrow \infty \) and we may apply Corollary 5.1 to deduce that \(\liminf _{j}\rho _{j}\ge 1\). But this implies that there exists a \(J\) for which \(\rho _{j}\ge \frac{3}{4}\) for \(j\ge J\), forcing \(\Delta _{j+l}=\min \left\{ 2\Delta _{j+l-1},\overline{\Delta }\right\} \). Eventually, we must have \(\Delta _{j}=\overline{\Delta }>0\), a contradiction.
In the case of a pure IP (no continuous variables), we note that as \(\alpha _{j}\downarrow \underline{\alpha }\), with \(X\left( \alpha _{j}\right) :=\left\{ x\in X : cx\le \alpha _{j}\right\} \), the discrete components of \(X\left( \alpha _{j}\right) \) do not change for \(j\) sufficiently large. As we have a pure IP, the function \( V_{\alpha _{j}} \left( \cdot \right) \) must therefore be constant and equal to \(V_{\underline{\alpha }} \left( \cdot \right) \), for \(j\) sufficiently large. We may now apply Corollary 5.1, with \(\alpha _{k}=\underline{\alpha }\) constant, to deduce that \(\liminf _{j}\rho _{j}\ge 1\).
Thus, we cannot have an infinite loop of acceptable descents without an update of a lower or upper bound. After a finite number of iterations, we must have \(\left| \overline{\alpha }-\underline{\alpha }\right| \le \varepsilon \). \(\square \)
Boland, N., Eberhard, A.C. & Tsoukalas, A. A Trust Region Method for the Solution of the Surrogate Dual in Integer Programming. J Optim Theory Appl 167, 558–584 (2015). https://doi.org/10.1007/s10957-014-0681-9