Convergence-order analysis for differential-inequalities-based bounds and relaxations of the solutions of ODEs

Journal of Global Optimization

Abstract

For the performance of global optimization algorithms, the rate of convergence of convex relaxations to the objective and constraint functions is critical. We extend results from Bompadre and Mitsos (J Glob Optim 52(1):1–28, 2012) to characterize the convergence rate of parametric bounds and relaxations of the solutions of ordinary differential equations (ODEs). Such bounds and relaxations are used for global dynamic optimization and are computed using auxiliary ODE systems that use interval arithmetic and McCormick relaxations. Two ODE relaxation methods (Scott et al. in Optim Control Appl Methods 34(2):145–163, 2013; Scott and Barton in J Glob Optim 57:143–176, 2013) are shown to give second-order convergence, yet they can behave very differently from each other in practice. As time progresses, the prefactor in the convergence-order bound tends to grow much more slowly for one of these methods, and can even decrease over time, yielding global optimization procedures that require significantly less computation time.

References

  1. Alefeld, G., Mayer, G.: Interval analysis: theory and applications. J. Comput. Appl. Math. 121(1–2), 421–464 (2000)

  2. Banga, J.R., Alonso, A.A., Singh, R.P.: Stochastic dynamic optimization of batch and semicontinuous bioprocesses. Biotechnol. Prog. 13, 326–335 (1997)

  3. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)

  4. Bompadre, A., Mitsos, A.: Convergence rate of McCormick relaxations. J. Glob. Optim. 52(1), 1–28 (2012)

  5. Bompadre, A., Mitsos, A., Chachuat, B.: Convergence analysis of Taylor models and McCormick–Taylor models. J. Glob. Optim. 57, 75–114 (2013)

  6. Cervantes, A.M., Wächter, A., Tütüncü, R.H., Biegler, L.T.: A reduced space interior point strategy for optimization of differential algebraic systems. Comput. Chem. Eng. 24, 39–51 (2000)

  7. Chachuat, B., Barton, P.I., Singer, A.B.: Global methods for dynamic optimization and mixed-integer dynamic optimization. Ind. Eng. Chem. Res. 45(25), 8373–8392 (2006)

  8. Chachuat, B., Villanueva, M.: Bounding the solutions of parametric ODEs: when Taylor models meet differential inequalities. In: Bogle, I.D.L., Fairweather, M. (eds.) 22nd European Symposium on Computer Aided Process Engineering, volume 30 of Comput. Aided Chem. Eng., pp. 1307–1311. Elsevier Science BV (2012)

  9. Dahlquist, G.: Stability and error bounds in the numerical integration of ordinary differential equations. PhD thesis, University of Stockholm (1958)

  10. Du, K., Kearfott, R.B.: The cluster problem in multivariate global optimization. J. Glob. Optim. 5(3), 253–265 (1994)

  11. Esposito, W.R., Floudas, C.A.: Global optimization for the parameter estimation of differential-algebraic systems. Ind. Eng. Chem. Res. 39, 1291–1310 (2000)

  12. Filippov, A.F.: Differential Equations with Discontinuous Righthand Sides. Kluwer Academic Publishers, Dordrecht (1988)

  13. Hairer, E., Norsett, S.P., Wanner, G.: Solving Ordinary Differential Equations I. Springer, Berlin (1993)

  14. Harrison, G.: Dynamic models with uncertain parameters. In: Avula, X. (ed.) Proc. First Int. Conf. Math. Model., vol. 1, pp. 295–304. University of Missouri, Rolla (1977)

  15. Harwood, S.M., Barton, P.I.: Efficient polyhedral enclosures for the reachable set of nonlinear control systems. Math. Control Signal 28(1), 8 (2016)

  16. Harwood, S.M., Scott, J.K., Barton, P.I.: Bounds on reachable sets using ordinary differential equations with linear programs embedded. IMA J. Math. Control I 33(2), 519–541 (2016)

  17. Houska, B., Logist, F., Van Impe, J., Diehl, M.: Robust optimization of nonlinear dynamic systems with application to a jacketed tubular reactor. J. Process Control 22(6), 1152–1160 (2012)

  18. Houska, B., Villanueva, M.E., Chachuat, B.: A validated integration algorithm for nonlinear ODEs using Taylor models and ellipsoidal calculus. In: 52nd IEEE Conf. Decis. Control, pp. 484–489 (2013)

  19. Houska, B., Villanueva, M.E., Chachuat, B.: Stable set-valued integration of nonlinear dynamic systems using affine set-parameterizations. SIAM J. Numer. Anal. 53(5), 2307–2328 (2015)

  20. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice-Hall, Upper Saddle River (2002)

  21. Khan, K.A., Watson, H.A.J., Barton, P.I.: Differentiable McCormick relaxations. J. Glob. Optim. 67(4), 687–729 (2017)

  22. Krogh, B.H., Thorpe, C.E.: Integrated path planning and dynamic steering control for autonomous vehicles. In: Proceedings of 1986 IEEE Int. Conf. on Robot. Autom., vol. 3, pp. 1664–1669. IEEE (1986)

  23. Leineweber, D.B., Bauer, I., Bock, H.G., Schlöder, J.P.: An efficient multiple shooting based reduced SQP strategy for large-scale dynamic process optimization. Part 1: theoretical aspects. Comput. Chem. Eng. 27, 157–166 (2003)

  24. Lin, Y., Stadtherr, M.A.: Deterministic global optimization for parameter estimation of dynamic systems. Ind. Eng. Chem. Res. 45(25), 8438–8448 (2006)

  25. Lin, Y., Stadtherr, M.A.: Deterministic global optimization of nonlinear dynamic systems. AIChE J. 53(4), 866–875 (2007)

  26. Lin, Y., Stadtherr, M.A.: Validated solutions of initial value problems for parametric ODEs. Appl. Numer. Math. 57(10), 1145–1162 (2007)

  27. Løvik, I., Hillestad, M., Hertzberg, T.: Long term dynamic optimization of a catalytic reactor system. Comput. Chem. Eng. 22, S707–S710 (1998)

  28. Luus, R., Dittrich, J., Keil, F.J.: Multiplicity of solutions in the optimization of a bifunctional catalyst blend in a tubular reactor. Can. J. Chem. Eng. 70(4), 780–785 (1992)

  29. Maravall, D., de Lope, J.: Multi-objective dynamic optimization with genetic algorithms for automatic parking. Soft Comput. 11(3), 249–257 (2007)

  30. McCormick, G.P.: Computability of global solutions to factorable nonconvex programs: part I—convex underestimating problems. Math. Program. 10(1), 147–175 (1976)

  31. Moles, C.G., Mendes, P., Banga, J.R.: Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 13(11), 2467–2474 (2003)

  32. Moore, R.E.: Interval arithmetic and automatic error analysis in digital computing. PhD thesis, Stanford University (1962)

  33. Moore, R.E.: Methods and Applications of Interval Analysis. SIAM, Philadelphia (1979)

  34. Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. Society for Industrial and Applied Mathematics, Philadelphia (2009)

  35. Müller, M.: Über die Eindeutigkeit der Integrale eines Systems gewöhnlicher Differentialgleichungen und die Konvergenz einer Gattung von Verfahren zur Approximation dieser Integrale. Sitz.-Ber. Heidelberger Akad. Wiss. Math.-Naturwiss. Kl. 9, 3–38 (1927)

  36. Najman, J., Mitsos, A.: Convergence analysis of multivariate McCormick relaxations. J. Glob. Optim. 66(4), 597–628 (2016)

  37. Neumaier, A.: Taylor forms-use and limits. Reliab. Comput. 9(1), 43–79 (2003). https://doi.org/10.1023/A:1023061927787

  38. Pongpunwattana, A., Rysdyk, R.: Real-time planning for multiple autonomous vehicles in dynamic uncertain environments. J. Aerosp. Comput. Inf. Commun. 1(12), 580–604 (2004)

  39. Prata, A., Oldenburg, J., Kroll, A., Marquardt, W.: Integrated scheduling and dynamic optimization of grade transitions for a continuous polymerization reactor. Comput. Chem. Eng. 32(3), 463–476 (2008)

  40. Rodriguez-Fernandez, M., Egea, J.A., Banga, J.R.: Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinformatics 7, 483 (2006)

  41. Sahlodin, A.M.: Global optimization of dynamic process systems using complete search methods. PhD thesis, McMaster University (2013)

  42. Sahlodin, A.M., Chachuat, B.: Convex/concave relaxations of parametric ODEs using Taylor models. Comput. Chem. Eng. 35(5), 844–857 (2011)

  43. Sahlodin, A.M., Chachuat, B.: Discretize-then-relax approach for convex/concave relaxations of the solutions of parametric ODEs. Appl. Numer. Math. 61(7), 803–820 (2011)

  44. Schaber, S.D.: Tools for dynamic model development. PhD thesis, Massachusetts Institute of Technology (2014)

  45. Schöbel, A., Scholz, D.: The theoretical and empirical rate of convergence for geometric branch-and-bound methods. J. Glob. Optim. 48, 473–495 (2010)

  46. Scholz, D.: Theoretical rate of convergence for interval inclusion functions. J. Glob. Optim. 53(4), 749–767 (2012)

  47. Scott, J.K.: Reachability analysis and deterministic global optimization of differential-algebraic systems. PhD thesis, Massachusetts Institute of Technology (2012)

  48. Scott, J.K., Barton, P.I.: Tight, efficient bounds on the solutions of chemical kinetics models. Comput. Chem. Eng. 34(5), 717–731 (2010)

  49. Scott, J.K., Barton, P.I.: Bounds on the reachable sets of nonlinear control systems. Automatica 49(1), 93–100 (2013)

  50. Scott, J.K., Barton, P.I.: Convex and concave relaxations for the parametric solutions of semi-explicit index-one differential-algebraic equations. J. Optim. Theory Appl. 156, 617–649 (2013)

  51. Scott, J.K., Barton, P.I.: Improved relaxations for the parametric solutions of ODEs using differential inequalities. J. Glob. Optim. 57, 143–176 (2013)

  52. Scott, J.K., Barton, P.I.: Interval bounds on the solutions of semi-explicit index-one DAEs. Part 1: analysis. Numer. Math. 125(1), 1–25 (2013)

  53. Scott, J.K., Barton, P.I.: Interval bounds on the solutions of semi-explicit index-one DAEs. Part 2: computation. Numer. Math. 125(1), 27–60 (2013)

  54. Scott, J.K., Chachuat, B., Barton, P.I.: Nonlinear convex and concave relaxations for the solutions of parametric ODEs. Optim. Control Appl. Methods 34(2), 145–163 (2013)

  55. Scott, J.K., Stuber, M.D., Barton, P.I.: Generalized McCormick relaxations. J. Glob. Optim. 51(4), 569–606 (2011)

  56. Shen, K., Scott, J.K.: Rapid and accurate reachability analysis for nonlinear dynamic systems by exploiting model redundancy. Comput. Chem. Eng. 106, 596–608 (2017)

  57. Singer, A.B., Barton, P.I.: Bounding the solutions of parameter dependent nonlinear ordinary differential equations. SIAM J. Sci. Comput. 27(6), 2167 (2006)

  58. Singer, A.B., Barton, P.I.: Global optimization with nonlinear ordinary differential equations. J. Glob. Optim. 34(2), 159–190 (2006). https://doi.org/10.1007/s10898-005-7074-4

  59. Singer, A.B., Taylor, J.W., Barton, P.I., Green, W.H.: Global dynamic optimization for parameter estimation in chemical kinetics. J. Phys. Chem. A 110(3), 971–976 (2006)

  60. Söderlind, G.: The logarithmic norm. History and modern theory. BIT Numer. Math. 46(3), 631–652 (2006)

  61. Tjoa, I.-B., Biegler, L.T.: Simultaneous solution and optimization strategies for parameter estimation of differential-algebraic equation systems. Ind. Eng. Chem. Res. 30, 376–385 (1991)

  62. Tulsyan, A., Barton, P.I.: Interval enclosures for reachable sets of chemical kinetic flow systems. Part 1: sparse transformation. Chem. Eng. Sci. 166, 334–344 (2017)

  63. Tulsyan, A., Barton, P.I.: Interval enclosures for reachable sets of chemical kinetic flow systems. Part 2: direct-bounding method. Chem. Eng. Sci. 166, 345–357 (2017)

  64. Tulsyan, A., Barton, P.I.: Interval enclosures for reachable sets of chemical kinetic flow systems. Part 3: indirect-bounding method. Chem. Eng. Sci. 166, 358–372 (2017)

  65. Vassiliadis, V., Sargent, R., Pantelides, C.: Solution of a class of multistage dynamic optimization problems. 1. Problems without path constraints. Ind. Eng. Chem. Res. 33, 2111–2122 (1994)

  66. Villanueva, M.E., Houska, B., Chachuat, B.: Unified framework for the propagation of continuous-time enclosures for parametric nonlinear ODEs. J. Glob. Optim. 62(3), 575–613 (2015)

  67. Wechsung, A., Schaber, S.D., Barton, P.I.: The cluster problem revisited. J. Glob. Optim. 58(3), 429–438 (2014)

Acknowledgements

We gratefully acknowledge funding from Novartis Pharmaceuticals and helpful comments from the anonymous reviewers.

Author information

Corresponding author

Correspondence to Paul I. Barton.

A Supporting lemmas and proofs

A.1 Proof of Lemma 2.2

Proof

If \(\mu = 0\), the result is trivial. Now consider \(\mu \ne 0\). Define

$$\begin{aligned} v(s) \equiv \exp (\mu (t_0 - s)) \int _{t_0}^s \mu x(r) {\mathrm {d}}r,\quad \forall s \in I. \end{aligned}$$
(A.1)

Differentiating gives

$$\begin{aligned} v'(s) = \mu \exp (\mu (t_0-s))\left( x(s) - \int _{t_0}^s \mu x(r) {\mathrm {d}}r \right) , \quad \forall s \in I. \end{aligned}$$

Since \(\mu \ne 0\),

$$\begin{aligned} \frac{v'(s)}{\mu } = \underbrace{\exp (\mu (t_0-s))}_{\ge 0} \underbrace{\left( x(s) - \int _{t_0}^s \mu x(r) {\mathrm {d}}r\right) }_{\le \lambda _0 + \lambda _1(s-t_0)}, \quad \forall s \in I, \end{aligned}$$

where the bound on the second term comes from (2.1). Therefore,

$$\begin{aligned} \frac{v'(s)}{\mu } \le \exp (\mu (t_0-s))(\lambda _0 + \lambda _1(s-t_0)),\quad \forall s \in I. \end{aligned}$$

Note that \(v(t_0) = 0\) and integrate:

$$\begin{aligned} \frac{v(t)}{\mu }&= \int _{t_0}^t \frac{v'(s)}{\mu } {\mathrm {d}}s \le \int _{t_0}^t \exp (\mu (t_0-s))(\lambda _0 + \lambda _1(s-t_0)) {\mathrm {d}}s,\\&= \frac{\lambda _0 \mu + \lambda _1 - \exp (\mu (t_0 - t))(\lambda _0 \mu + \lambda _1 + \lambda _1 \mu (t-t_0))}{\mu ^2},\quad \forall t \in I. \end{aligned}$$

Substitute in the definition for v from (A.1):

$$\begin{aligned} \exp (\mu (t_0 - t)) \int _{t_0}^t x(s) {\mathrm {d}}s \le \frac{\lambda _0 \mu + \lambda _1 - \exp (\mu (t_0 - t))(\lambda _0 \mu + \lambda _1 + \lambda _1 \mu (t-t_0))}{\mu ^2},\quad \forall t \in I. \end{aligned}$$

Multiply by \(\exp (\mu (t-t_0))\):

$$\begin{aligned} \int _{t_0}^t x(s) {\mathrm {d}}s \le \frac{(\lambda _0 \mu + \lambda _1)\exp (\mu (t-t_0)) - (\lambda _0 \mu + \lambda _1 + \lambda _1 \mu (t-t_0))}{\mu ^2},\quad \forall t \in I. \end{aligned}$$

Differentiate the above inequality to obtain

$$\begin{aligned} x(t)&\le \left( \lambda _0 + \frac{\lambda _1}{\mu }\right) \exp (\mu (t-t_0))-\frac{\lambda _1}{\mu },\quad \forall t \in I. \end{aligned}$$

\(\square \)
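As a numerical sanity check of the bound just derived, the following minimal sketch (not part of the original analysis) integrates a hypothetical function \(x\) satisfying \(\dot{x}(t) = \mu x(t) + \lambda _1 - s(t)\) with \(x(t_0)=\lambda _0\) and a nonnegative slack \(s\). Such an \(x\) satisfies \(x(t) \le \lambda _0 + \lambda _1(t-t_0) + \int _{t_0}^t \mu x(r) {\mathrm {d}}r\), which is the form of (2.1) suggested by the underbrace above. The parameter values, the slack function, and the function names are illustrative assumptions.

```python
import math

def gronwall_bound(t, t0, mu, lam0, lam1):
    """Right-hand side of the final inequality in the proof (mu != 0)."""
    return (lam0 + lam1 / mu) * math.exp(mu * (t - t0)) - lam1 / mu

def integrate_with_slack(t0, tf, mu, lam0, lam1, slack, n_steps=2000):
    """Classical RK4 for x' = mu*x + lam1 - slack(t), x(t0) = lam0.

    With slack(t) >= 0, x satisfies the integral inequality
    x(t) <= lam0 + lam1*(t - t0) + int_{t0}^t mu*x(r) dr.
    """
    h = (tf - t0) / n_steps
    f = lambda t, x: mu * x + lam1 - slack(t)
    ts, xs = [t0], [lam0]
    t, x = t0, lam0
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        ts.append(t)
        xs.append(x)
    return ts, xs

# Illustrative (assumed) data.
T0, TF, MU, LAM0, LAM1 = 0.0, 2.0, 0.7, 1.0, 0.5

# Case 1: strictly positive slack for t > 0, so x should stay below the bound.
ts, xs = integrate_with_slack(T0, TF, MU, LAM0, LAM1,
                              slack=lambda t: 0.3 * math.sin(t) ** 2)
max_violation = max(x - gronwall_bound(t, T0, MU, LAM0, LAM1)
                    for t, x in zip(ts, xs))

# Case 2: zero slack, so x should coincide with the bound (the bound is attained).
ts_eq, xs_eq = integrate_with_slack(T0, TF, MU, LAM0, LAM1, slack=lambda t: 0.0)
max_gap_equality = max(abs(x - gronwall_bound(t, T0, MU, LAM0, LAM1))
                       for t, x in zip(ts_eq, xs_eq))
```

In this sketch, `max_violation` should be nonpositive up to rounding, and `max_gap_equality` should be at the level of the integration error, which illustrates that the bound cannot be improved without further assumptions.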

A.2 Proof of Theorem 4.14

The following argument follows a similar line of reasoning to [47, Corollary 3.3.6] and [47, Proof of Theorem 3.3.2], but is sufficiently different that we prove it in full.

Proof

Fix any \({\widehat{P}}\in {\mathbb {I}P}\). Since solutions to (4.13) and (4.14) are assumed to exist,

$$\begin{aligned}&{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}) \in D,\quad \forall t\in I. \end{aligned}$$

Since, in addition, the intervals in Hypotheses 1 and 2 of Theorem 4.14 are proper,

$$\begin{aligned}&{{\mathbf {v}}}(t,{\widehat{P}}) \le {{\mathbf {w}}}(t,{\widehat{P}})\quad \text {and} \quad \widetilde{{{\mathbf {v}}}}(t,{\widehat{P}})\le \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}), \quad \forall t \in I. \end{aligned}$$

We need to prove that \(\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) \le {{\mathbf {v}}}(t,{\widehat{P}})\) and \({{\mathbf {w}}}(t,{\widehat{P}}) \le \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})\), \(\forall t \in I\), which holds at \(t_0\) by Hypothesis 1. Suppose (to arrive at a contradiction) \(\exists t \in I\) such that either \(v_i(t,{\widehat{P}}) < \widetilde{v}_i(t,{\widehat{P}})\) or \(w_i(t,{\widehat{P}}) > \widetilde{w}_i(t,{\widehat{P}})\) for at least one \(i \in \{1,\ldots ,n_x\}\) and define

$$\begin{aligned} t_1 \equiv \inf \{t \in I: v_i(t,{\widehat{P}}) < \widetilde{v}_i(t,{\widehat{P}}) \text { or } w_i(t,{\widehat{P}}) > \widetilde{w}_i(t,{\widehat{P}}) \text {, for at least one } i\}. \end{aligned}$$

Applying [47, Lemma 3.3.5] with the definition \(\varvec{\delta }(t) \equiv (\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) - {{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}}) - \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}))\), we obtain the following fact:

  • For any \(t_4 \in (t_1, t_f]\), \(\varepsilon > 0\), and \(\beta \in \mathbb {R}_+\), there exists \(j \in \{1, \ldots , n_x\}\), an absolutely continuous and non-decreasing function \(\rho :[t_1,t_4]\rightarrow \mathbb {R}\), and numbers \(t_2, t_3 \in [t_1, t_4]\) with \(t_2 < t_3\) such that

    $$\begin{aligned} 0 < \rho (t)&\le \varepsilon ,\quad \forall t \in [t_1,t_4]\quad \text {and} \quad \dot{\rho }(t) > \beta \rho (t),\quad \text {a.e.}\ t\in [t_1,t_4], \end{aligned}$$
    (A.2)
    $$\begin{aligned} \widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) - {\mathbf {1}}\rho (t)&< {{\mathbf {v}}}(t,{\widehat{P}}) \quad \text {and} \quad {{\mathbf {w}}}(t,{\widehat{P}}) < \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}) + {\mathbf {1}}\rho (t),\quad \forall t \in [t_2, t_3), \end{aligned}$$
    (A.3)

    where \({\mathbf {1}}\) is a vector with all components equal to 1, and

    $$\begin{aligned}&v_{j}(t_2,{\widehat{P}}) = \widetilde{v}_{j}(t_2,{\widehat{P}}),\quad v_{j}(t_3,{\widehat{P}}) = \widetilde{v}_{j}(t_3,{\widehat{P}}) - \rho (t_3), \nonumber \\&\quad \text {and}\quad v_{j}(t,{\widehat{P}}) < \widetilde{v}_{j}(t,{\widehat{P}}), \quad \forall t \in (t_2,t_3) \end{aligned}$$
    (A.4)
    $$\begin{aligned}&\Big (\text {or}\quad w_{j}(t_2,{\widehat{P}}) = \widetilde{w}_{j}(t_2,{\widehat{P}}),\quad w_{j}(t_3,{\widehat{P}}) = \widetilde{w}_{j}(t_3,{\widehat{P}}) + \rho (t_3), \nonumber \\&\quad \text {and}\quad w_{j}(t,{\widehat{P}}) > \widetilde{w}_{j}(t,{\widehat{P}}), \quad \forall t \in (t_2,t_3) \Big ). \end{aligned}$$
    (A.5)

To apply this fact, we choose \(\varepsilon >0\) small enough that

$$\begin{aligned} K \equiv [\widetilde{{{\mathbf {v}}}}(t_1,{\widehat{P}}),\widetilde{{{\mathbf {w}}}}(t_1,{\widehat{P}})]+2\varepsilon [-{\mathbf {1}},{\mathbf {1}}] \subset D, \end{aligned}$$
(A.6)

which is possible because D is open and \([\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})] \subset D\), \(\forall t \in I\), by the existence of a solution to (4.14). Next, we choose \(\beta =L\), where L is the larger of the two Lipschitz constants for \(\widetilde{{{\mathbf {u}}}}\) and \(\widetilde{{{\mathbf {o}}}}\) on \(I \times \mathbb {I}K\times \mathbb {I}{\widehat{P}}\). Such an L exists by Hypothesis 4 (see also Remark 4.2). Finally, we choose \(t_4 > t_1\) sufficiently close to \(t_1\) (possible by continuity of the solutions) that

$$\begin{aligned} {[}\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})] \subset [\widetilde{{{\mathbf {v}}}}(t_1,{\widehat{P}}),\widetilde{{{\mathbf {w}}}}(t_1,{\widehat{P}})]+\varepsilon [- {\mathbf {1}},{\mathbf {1}}],\quad \forall t \in (t_1,t_4]. \end{aligned}$$
(A.7)

Now, suppose (A.4) holds (the proof is analogous if instead (A.5) holds). We know from (A.3) that

$$\begin{aligned} {[}{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}})] \subset [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}})-\rho (t){\mathbf {1}}, \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})+\rho (t){\mathbf {1}}] ,\quad \forall t \in [t_2, t_3).\end{aligned}$$

By (4.17), (4.18), and the inclusion above, we have

$$\begin{aligned} u_j(t, [{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}})], {\widehat{P}})&\ge \widetilde{u}_j(t, [{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}})], {\widehat{P}}), \nonumber \\&\ge \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) - \rho (t){\mathbf {1}}, \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}) + \rho (t){\mathbf {1}}], {\widehat{P}}),\quad \mathrm {a.e.}\ t \in [t_2,t_3). \end{aligned}$$
(A.8)

Above, \([\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) - \rho (t){\mathbf {1}}, \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}) + \rho (t){\mathbf {1}}]\) is guaranteed to be a subset of K, and hence of D, by (A.2), (A.7), and (A.6). Thus, Lipschitz continuity of \(\widetilde{u}_j\) on \(I\times {\mathbb {I}}K\times {\mathbb {I}}{\widehat{P}}\) gives

$$\begin{aligned} \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}) - \rho (t){\mathbf {1}}, \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}}) + \rho (t){\mathbf {1}}], {\widehat{P}}) \ge \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})], {\widehat{P}}) - L \rho (t). \end{aligned}$$
(A.9)

Combining (A.8) and (A.9),

$$\begin{aligned} u_j(t, [{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}})], {\widehat{P}}) \ge \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})], {\widehat{P}}) - L \rho (t),\quad \text {a.e.}\ t \in [t_2, t_3]. \end{aligned}$$

Adding \(\dot{\rho }(t)\) to both sides,

$$\begin{aligned} u_j(t, [{{\mathbf {v}}}(t,{\widehat{P}}), {{\mathbf {w}}}(t,{\widehat{P}})], {\widehat{P}}) + \dot{\rho }(t)&\ge \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})], {\widehat{P}}) - L \rho (t) + \dot{\rho }(t),\\&> \widetilde{u}_j(t, [\widetilde{{{\mathbf {v}}}}(t,{\widehat{P}}), \widetilde{{{\mathbf {w}}}}(t,{\widehat{P}})], {\widehat{P}}),\quad \text {a.e.}\ t \in [t_2, t_3], \end{aligned}$$

where the second inequality follows from (A.2) with \(\beta = L\). By [47, Theorem 3.3.3], this implies that \((\widetilde{v}_j(\cdot ,{\widehat{P}}) - v_j(\cdot ,{\widehat{P}}) - \rho )\) is non-increasing on \([t_2, t_3]\), so that

$$\begin{aligned} \widetilde{v}_j(t_3,{\widehat{P}}) - v_j(t_3,{\widehat{P}}) - \rho (t_3) \le \widetilde{v}_j(t_2,{\widehat{P}}) - v_j(t_2,{\widehat{P}}) - \rho (t_2). \end{aligned}$$

But, by (A.4), this implies that \(0 \le - \rho (t_2)\), which contradicts (A.2). \(\square \)
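For completeness, the substitution behind this final contradiction can be written out. The following only restates the step above using (A.4) and (A.2):

$$\begin{aligned} \widetilde{v}_j(t_3,{\widehat{P}}) - v_j(t_3,{\widehat{P}}) - \rho (t_3)&= \rho (t_3) - \rho (t_3) = 0, \\ \widetilde{v}_j(t_2,{\widehat{P}}) - v_j(t_2,{\widehat{P}}) - \rho (t_2)&= 0 - \rho (t_2) < 0. \end{aligned}$$

The monotonicity inequality would then read \(0 \le -\rho (t_2) < 0\), which is impossible.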

A.3 (1,2)-Convergence of natural McCormick extensions

In this section, we establish the (1, 2)-convergence of natural McCormick extensions [47, Definition 2.4.31] as required by Assumption 5.17. To begin, we show that (1, 2)-convergence is composable.

Theorem A.1

Let \(D_x\subset \mathbb {R}^{n}\) and let \(\mathcal {F}:\mathbb {M}D_x \rightarrow \mathbb {MR}^m\) have the form \(\mathcal {F}(\mathcal {X})=(F^B(X^B),F^C(\mathcal {X}))\), \(\forall \mathcal {X}=(X^B,X^C)\in \mathbb {M}D_x\). Moreover, let \(F^B(X^B)\subset D_y\subset \mathbb {R}^m\) for all \(X^B\in \mathbb {I}D_x\), and let \(\mathcal {G}:\mathbb {M}D_y \rightarrow \mathbb {MR}^q\) have the form \(\mathcal {G}(\mathcal {Y})=(G^B(Y^B),G^C(\mathcal {Y}))\). If \(\mathcal {F}\) and \(\mathcal {G}\) have (1, 2)-convergence on \(\mathbb {M}D_x\) and \(\mathbb {M}D_y\), respectively, then \(\mathcal {G}\circ \mathcal {F}\) has (1, 2)-convergence on \(\mathbb {M}D_x\).

Proof

Applying (1, 2)-convergence for \(\mathcal {G}\) and \(\mathcal {F}\) sequentially, we obtain for any \(\mathcal {X}\in \mathbb {M}D_x\),

$$\begin{aligned} w(G^B\circ F^B(X^B))&\le \tau ^G_{BB}w(F^B(X^B)) \end{aligned}$$
(A.10)
$$\begin{aligned}&\le \tau ^G_{BB}\tau ^F_{BB}w(X^B), \end{aligned}$$
(A.11)
$$\begin{aligned} w(G^C\circ \mathcal {F}(\mathcal {X}))&\le \tau ^G_{CC}w(F^C(\mathcal {X}))+\tau ^G_{CB}w(F^B(X^B))^{2}, \end{aligned}$$
(A.12)
$$\begin{aligned}&\le \tau ^G_{CC}(\tau ^F_{CC}w(X^C)+\tau ^F_{CB}w(X^B)^2)+\tau ^G_{CB}(\tau ^F_{BB}w(X^B))^{2}, \end{aligned}$$
(A.13)
$$\begin{aligned}&\le \tau ^G_{CC}\tau ^F_{CC}w(X^C)+(\tau ^G_{CC}\tau ^F_{CB}+\tau ^G_{CB}(\tau ^F_{BB})^2)w(X^B)^{2}. \end{aligned}$$
(A.14)

\(\square \)
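The bookkeeping in (A.10)–(A.14) can be sketched in a few lines of Python. The `Prefactors` container, the function names, and the numerical values are hypothetical and serve only to illustrate how the prefactors of \(\mathcal {G}\circ \mathcal {F}\) are read off from (A.11) and (A.14).

```python
from typing import NamedTuple, Tuple

class Prefactors(NamedTuple):
    """Hypothetical container for (1,2)-convergence prefactors."""
    bb: float  # tau_BB: bound width <= bb * w(X^B)
    cc: float  # tau_CC: coefficient of w(X^C) in the relaxation width
    cb: float  # tau_CB: coefficient of w(X^B)^2 in the relaxation width

def compose(g: Prefactors, f: Prefactors) -> Prefactors:
    """Prefactors of G o F as read off from (A.11) and (A.14)."""
    return Prefactors(
        bb=g.bb * f.bb,
        cc=g.cc * f.cc,
        cb=g.cc * f.cb + g.cb * f.bb ** 2,
    )

def width_bounds(p: Prefactors, w_b: float, w_c: float) -> Tuple[float, float]:
    """Upper bounds on the output bound width and relaxation width."""
    return p.bb * w_b, p.cc * w_c + p.cb * w_b ** 2

# Illustrative (assumed) prefactors for F and G.
tau_f = Prefactors(bb=2.0, cc=1.0, cb=0.5)
tau_g = Prefactors(bb=3.0, cc=2.0, cb=1.0)
tau_gf = compose(tau_g, tau_f)

# Applying the two bounds one after the other gives the same result
# as applying the composed prefactors once.
w_b, w_c = 0.1, 0.01
fb, fc = width_bounds(tau_f, w_b, w_c)
sequential = width_bounds(tau_g, fb, fc)
direct = width_bounds(tau_gf, w_b, w_c)
```

The relaxation-width term stays quadratic in \(w(X^B)\) after composition, which is the content of Theorem A.1.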

Recall that a function is factorable if it can be written as a finite recursive sequence of basic operations including addition, multiplication, and composition with univariate functions from a standard library (e.g., \(\frac{1}{x}\), \(e^x\), \(x^n\)). Natural McCormick extensions are constructed for such functions by recursively applying simple relaxation rules for each basic operation in this sequence (see [47] for a precise definition). Specifically, these rules are relaxation functions for the basic operations. In light of Theorem A.1, it follows that, in order to establish (1, 2)-convergence of natural McCormick extensions, it suffices to establish (1, 2)-convergence of the relaxation functions for addition, multiplication, and univariate composition.
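To make the notion of a factorable function concrete, the sketch below writes a hypothetical function \(f(x,y)=e^{xy}+\frac{1}{x}\) as a finite sequence of factors and evaluates it factor by factor. The representation (`FACTORS`, `UNIVARIATE`, `evaluate`) is an assumption made for illustration and is not the data structure used in [47].

```python
import math

# Hypothetical factorable function f(x, y) = exp(x * y) + 1 / x, written as a
# finite sequence of factors. Each factor applies one basic operation to
# previously computed factors.
FACTORS = [
    ("v1", "input", "x"),
    ("v2", "input", "y"),
    ("v3", "mul", "v1", "v2"),
    ("v4", "exp", "v3"),
    ("v5", "inv", "v1"),
    ("v6", "add", "v4", "v5"),
]

# Library of univariate functions permitted in the factorization.
UNIVARIATE = {"exp": math.exp, "inv": lambda z: 1.0 / z}

def evaluate(factors, inputs):
    """Evaluate the factor sequence in order and return the last factor."""
    values = {}
    for name, op, *args in factors:
        if op == "input":
            values[name] = inputs[args[0]]
        elif op == "add":
            values[name] = values[args[0]] + values[args[1]]
        elif op == "mul":
            values[name] = values[args[0]] * values[args[1]]
        else:
            values[name] = UNIVARIATE[op](values[args[0]])
    return values[factors[-1][0]]

result = evaluate(FACTORS, {"x": 2.0, "y": 0.5})
```

A natural McCormick extension would traverse the same sequence but replace each real-valued operation with the corresponding relaxation rule on McCormick objects.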

A.3.1 Addition

Definition A.2

McCormick addition \(+:\mathbb {MR}\times \mathbb {MR}\rightarrow \mathbb {MR}\) is defined by

$$\begin{aligned} +(\mathcal {X},\mathcal {Y})&= \mathcal {X}+\mathcal {Y} = (X^B+Y^B,(X^B\cap X^C)+(Y^B\cap Y^C)). \end{aligned}$$
(A.15)

Lemma A.3

McCormick addition has (1,2)-convergence on \(\mathbb {MR}\times \mathbb {MR}\).

Proof

Choose any \((\mathcal {X},\mathcal {Y})\in \mathbb {MR}\times \mathbb {MR}\) and let \(\mathcal {Z}=+(\mathcal {X},\mathcal {Y})\). Then,

$$\begin{aligned} w(Z^B)&= w(X^B)+w(Y^B) \le 2\max (w(X^B),w(Y^B)) =2w(X^B\times Y^B) \end{aligned}$$
(A.16)

and

$$\begin{aligned} w(Z^C) = w(X^B\cap X^C)+w(Y^B\cap Y^C)&\le w(X^C)+w(Y^C), \nonumber \\&\le 2 \max \left( w(X^C),w(Y^C)\right) , \nonumber \\&= 2w(X^C\times Y^C). \end{aligned}$$
(A.17)

\(\square \)
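A minimal sketch of McCormick addition (A.15) and of the width estimates (A.16) and (A.17) follows. Intervals are represented as `(lower, upper)` tuples and a McCormick object as a pair `(X^B, X^C)`; this representation and the sample data are assumptions for illustration.

```python
from typing import Tuple

Interval = Tuple[float, float]
McCormick = Tuple[Interval, Interval]  # (X^B, X^C)

def width(a: Interval) -> float:
    return a[1] - a[0]

def intersect(a: Interval, b: Interval) -> Interval:
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("empty intersection: not a valid McCormick object")
    return (lo, hi)

def iadd(a: Interval, b: Interval) -> Interval:
    """Standard interval addition."""
    return (a[0] + b[0], a[1] + b[1])

def mc_add(x: McCormick, y: McCormick) -> McCormick:
    """McCormick addition as in (A.15)."""
    (xb, xc), (yb, yc) = x, y
    return (iadd(xb, yb), iadd(intersect(xb, xc), intersect(yb, yc)))

# Illustrative (assumed) data.
X = ((0.0, 2.0), (0.5, 1.0))
Y = ((-1.0, 1.0), (-0.25, 0.5))
ZB, ZC = mc_add(X, Y)

# Widths of the product boxes, taken as the largest component width.
w_b = max(width(X[0]), width(Y[0]))
w_c = max(width(X[1]), width(Y[1]))
```

For these data, the estimates (A.16) and (A.17) read `width(ZB) <= 2 * w_b` and `width(ZC) <= 2 * w_c`, so both prefactors equal 2 and no quadratic term is needed.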

A.3.2 Multiplication

The following definition of McCormick’s multiplication rule is nonstandard but facilitates our convergence arguments. It is shown in [47, §2.4.2] that this definition is equivalent to McCormick’s original rule [30], except that the \(\mathrm {Cut}\) function is not applied to \(\mathcal {X}\) and \(\mathcal {Y}\) in the original work.

Definition A.4

McCormick multiplication \(\times :\mathbb {MR}\times \mathbb {MR}\rightarrow \mathbb {MR}\) is defined by

$$\begin{aligned} \times (\mathcal {X},\mathcal {Y})&= \mathcal {XY} = (X^BY^B,[z^{cv},z^{cc}]), \end{aligned}$$

where \(X^BY^B\) denotes standard interval multiplication [33] and

$$\begin{aligned} z^{cv}&= \max \left( \left[ y^L\bar{X}^C+x^L\bar{Y}^C-x^Ly^L\right] ^L,\left[ y^U\bar{X}^C+x^U\bar{Y}^C-x^Uy^U\right] ^L\right) , \\ z^{cc}&= \min \left( \left[ y^L\bar{X}^C+x^U\bar{Y}^C-y^Lx^U\right] ^U,\left[ y^U\bar{X}^C+x^L\bar{Y}^C-y^Ux^L\right] ^U\right) . \end{aligned}$$

Above, \(\bar{\mathcal {X}}=\mathrm {Cut}(\mathcal {X})\), \(\bar{\mathcal {Y}}=\mathrm {Cut}(\mathcal {Y})\), and the notations \([\cdot ]^L\) and \([\cdot ]^U\) refer to the lower and upper bounds of the interval-valued quantity in brackets, respectively.

Lemma A.5

McCormick multiplication is (1,2)-convergent on \(\mathbb {M}K\) for any compact \(K\subset \mathbb {R}\times \mathbb {R}\).

Proof

Choose any compact \(K\subset \mathbb {R}\times \mathbb {R}\). The existence of \(\tau _{BB}\ge 0\) such that \(w(X^BY^B)\le \tau _{BB}w(X^B\times Y^B)\) for all \((X^B,Y^B)\in {\mathbb {I}}K\) is well known [33]. Choose any \((\mathcal {X},\mathcal {Y}) \in \mathbb {M}K\) and let \(\mathcal {Z}=\times (\mathcal {X},\mathcal {Y})\). We have \(w(Z^C) = z^{cc}-z^{cv}\). There are four cases to consider. For the first case,

$$\begin{aligned} w(Z^C)&= \left[ y^L\bar{X}^C+x^U\bar{Y}^C-y^Lx^U\right] ^U-\left[ y^L\bar{X}^C+x^L\bar{Y}^C-x^Ly^L\right] ^L. \end{aligned}$$

Writing \(r^U=w(R)+r^L\) for \(R=\left[ y^L\bar{X}^C+x^U\bar{Y}^C-y^Lx^U\right] \) on the right,

$$\begin{aligned} w(Z^C)&= w\left( \left[ y^L\bar{X}^C+x^U\bar{Y}^C-y^Lx^U\right] \right) \\&\quad +\left[ y^L\bar{X}^C+x^U\bar{Y}^C-y^Lx^U\right] ^L-\left[ y^L\bar{X}^C+x^L\bar{Y}^C-x^Ly^L\right] ^L, \\&= w\left( \left[ y^L\bar{X}^C+x^U\bar{Y}^C\right] \right) \\&\quad +\left[ y^L\bar{X}^C\right] ^L+\left[ x^U\bar{Y}^C\right] ^L-y^Lx^U-\left[ y^L\bar{X}^C\right] ^L-\left[ x^L\bar{Y}^C\right] ^L+x^Ly^L, \\&\le |y^L|w(\bar{X}^C)+|x^U|w(\bar{Y}^C)+\left[ x^U\bar{Y}^C\right] ^L-y^Lx^U-\left[ x^L\bar{Y}^C\right] ^L+x^Ly^L, \\&= |y^L|w(\bar{X}^C)+|x^U|w(\bar{Y}^C)+\left[ x^U\bar{Y}^C-y^Lx^U\right] ^L-\left[ x^L\bar{Y}^C-x^Ly^L\right] ^L, \\&\le |y^L|w(X^C)+|x^U|w(Y^C)+\left[ x^U(\bar{Y}^C-y^L)\right] ^L-\left[ x^L(\bar{Y}^C-y^L)\right] ^L. \end{aligned}$$

Noting that \(\bar{Y}^C\subset Y^B\), it follows that every element of \(\bar{Y}^C-y^L\) is nonnegative and bounded above by \(w(Y^B)\). Thus, \(\exists q_1,q_2\in (\bar{Y}^C-y^L)\), both bounded between 0 and \(w(Y^B)\), satisfying

$$\begin{aligned} w(Z^C)&\le |y^L|w(X^C)+|x^U|w(Y^C)+x^Uq_1-x^Lq_2, \\&= |y^L|w(X^C)+|x^U|w(Y^C)+(x^L+w(X^B))q_1-x^Lq_2, \\&= |y^L|w(X^C)+|x^U|w(Y^C)+x^L(q_1-q_2)+w(X^B)q_1, \\&\le |y^L|w(X^C)+|x^U|w(Y^C)+|x^L|w(\bar{Y}^C-y^L)+w(X^B)w(Y^B), \\&\le |y^L|w(X^C)+(|x^U|+|x^L|)w(Y^C)+w(X^B)w(Y^B), \\&\le (|y^L|+|x^U|+|x^L|)\max (w(X^C),w(Y^C))+\max (w(X^B)^2,w(Y^B)^2), \\&\le 3\max \{|x^L|,|x^U|,|y^L|,|y^U|\}w(X^C\times Y^C)+w(X^B\times Y^B)^2, \\&\le 3(\max _{x\in K}|x|)w(X^C\times Y^C)+w(X^B\times Y^B)^2. \end{aligned}$$

Similar arguments hold in the remaining three cases. \(\square \)
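The multiplication rule of Definition A.4 and the final estimate of the proof can be checked numerically on a small example. The sketch below assumes that the relaxation part of \(\mathrm {Cut}(\mathcal {X})\) is \(X^B\cap X^C\) (consistent with the intersection used in (A.15)), represents intervals as `(lower, upper)` tuples, and takes the maximum absolute value over the endpoints of \(X^B\) and \(Y^B\); the helper names and data are illustrative.

```python
import random
from typing import Tuple

Interval = Tuple[float, float]

def scale(c: float, a: Interval) -> Interval:
    """Product of a scalar and an interval."""
    p, q = c * a[0], c * a[1]
    return (min(p, q), max(p, q))

def affine(c1: float, a: Interval, c2: float, b: Interval, const: float) -> Interval:
    """Interval value of c1*a + c2*b + const."""
    s, t = scale(c1, a), scale(c2, b)
    return (s[0] + t[0] + const, s[1] + t[1] + const)

def cut(b: Interval, c: Interval) -> Interval:
    """Relaxation part of Cut(X), assumed here to be X^B intersected with X^C."""
    return (max(b[0], c[0]), min(b[1], c[1]))

def mc_mul_relaxation(xb: Interval, xc: Interval,
                      yb: Interval, yc: Interval) -> Interval:
    """[z^cv, z^cc] from Definition A.4."""
    (xl, xu), (yl, yu) = xb, yb
    xc_bar, yc_bar = cut(xb, xc), cut(yb, yc)
    z_cv = max(affine(yl, xc_bar, xl, yc_bar, -xl * yl)[0],
               affine(yu, xc_bar, xu, yc_bar, -xu * yu)[0])
    z_cc = min(affine(yl, xc_bar, xu, yc_bar, -yl * xu)[1],
               affine(yu, xc_bar, xl, yc_bar, -yu * xl)[1])
    return (z_cv, z_cc)

# Illustrative (assumed) data.
XB, XC = (-1.0, 2.0), (0.0, 1.0)
YB, YC = (1.0, 3.0), (2.0, 2.5)
z_cv, z_cc = mc_mul_relaxation(XB, XC, YB, YC)

# Right-hand side of the final estimate in the proof.
m = max(abs(v) for v in XB + YB)
w_c = max(XC[1] - XC[0], YC[1] - YC[0])
w_b = max(XB[1] - XB[0], YB[1] - YB[0])
bound = 3 * m * w_c + w_b ** 2

# Sampled products of points in X^C and Y^C should lie in [z^cv, z^cc].
random.seed(0)
samples = [random.uniform(*XC) * random.uniform(*YC) for _ in range(1000)]
```

Here `z_cc - z_cv` is the relaxation width \(w(Z^C)\), which should not exceed `bound`; the sampled products give a quick check that the computed interval is a valid enclosure.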

A.3.3 Univariate functions

Let \(\mathcal {L}\) be a library of univariate functions \(u:B\subset \mathbb {R}\rightarrow \mathbb {R}\) that are permissible in the definition of a factorable function. The construction of the natural McCormick extension requires the following assumption.

Assumption A.6

For every \(u:B\subset \mathbb {R}\rightarrow \mathbb {R}\) in \(\mathcal {L}\), the following objects are available:

  1. An inclusion function \(U^B:\mathbb {I}B\rightarrow \mathbb {IR}\).

  2. A scheme of estimators \((u^{cv},u^{cc}):\mathbb {I}B\times B\rightarrow \mathbb {R}\times \mathbb {R}\).

  3. Functions \(x^{\min },x^{\max }:\mathbb {I}B\rightarrow \mathbb {R}\) such that \(x^{\min }(X)\) and \(x^{\max }(X)\) are a minimum of \(u^{cv}(X,\cdot )\) and a maximum of \(u^{cc}(X,\cdot )\) on X, respectively.

Definition A.7

For every \(u:B\subset \mathbb {R}\rightarrow \mathbb {R}\) in \(\mathcal {L}\), the McCormick univariate composition rule \(\mathcal {U}:\mathbb {M}B\rightarrow \mathbb {MR}\) is defined by

$$\begin{aligned} \mathcal {U}(\mathcal {X}) = \big (U^B(X^B),\big [u^{cv}(X^B,&\,\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B))), u^{cc}(X^B,\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B)))\big ]\big ), \end{aligned}$$

where the \(\mathrm {mid}\) function selects the middle value of its arguments.

Note that \(\mathcal {X}\in \mathbb {M}B\) implies that either \(x^{cv}\in X^B\) or \(x^{cc}\in X^B\), or both. By definition, \(x^{\min }(X^B)\) and \(x^{\max }(X^B)\) are both also in \(X^B\), so that, in both uses of the \(\mathrm {mid}\) function above, at least two of the three arguments lie in \(X^B\). Thus, \(\mathrm {mid}\) chooses an element of \(X^B\), and hence of B, so that \(\mathcal {U}(\mathcal {X})\) is defined.
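As an illustration of Definition A.7, the following minimal sketch instantiates the composition rule for the single library function \(u=\exp \). The choice of \(\exp \) and the names `Interval`, `McCormick`, and `exp_rule` are assumptions made for this example and do not come from the article. Because \(\exp \) is convex and increasing, the sketch takes \(u^{cv}(X,\cdot )=\exp \), takes \(u^{cc}(X,\cdot )\) to be the secant over X, and uses \(x^{\min }(X)=x^L\) and \(x^{\max }(X)=x^U\).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float


@dataclass(frozen=True)
class McCormick:
    """Hypothetical container for a McCormick object (X^B, X^C)."""
    bounds: Interval  # X^B
    relax: Interval   # X^C = [x^cv, x^cc]


def mid(a: float, b: float, c: float) -> float:
    """Return the middle value of the three arguments."""
    return sorted((a, b, c))[1]


def exp_rule(x: McCormick) -> McCormick:
    """Composition rule of Definition A.7, specialized to u = exp (an assumed example)."""
    xB = x.bounds
    x_cv, x_cc = x.relax.lo, x.relax.hi

    def u_cv(z: float) -> float:
        # exp is convex, so it serves as its own convex underestimator.
        return math.exp(z)

    def u_cc(z: float) -> float:
        # The secant of a convex function over X^B is a concave overestimator.
        if xB.hi == xB.lo:
            return math.exp(xB.lo)
        slope = (math.exp(xB.hi) - math.exp(xB.lo)) / (xB.hi - xB.lo)
        return math.exp(xB.lo) + slope * (z - xB.lo)

    # exp is increasing: u_cv is minimized at the lower bound of X^B,
    # and the secant is maximized at the upper bound.
    x_min, x_max = xB.lo, xB.hi

    image = Interval(math.exp(xB.lo), math.exp(xB.hi))  # U^B(X^B)
    cv = u_cv(mid(x_cv, x_cc, x_min))
    cc = u_cc(mid(x_cv, x_cc, x_max))
    return McCormick(image, Interval(cv, cc))


result = exp_rule(McCormick(Interval(0.0, 1.0), Interval(0.25, 0.75)))
```

In this call both \(\mathrm {mid}\) evaluations return points of \(X^B=[0,1]\), as the argument above guarantees. The same holds when \(X^C\) only partially overlaps \(X^B\), for example with `Interval(-0.5, 0.5)` as the relaxation part.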

The following assumptions are required to establish (1, 2)-convergence of \(\mathcal {U}\) and are standard in the convergence literature (see Theorem 8 in [4]).

Assumption A.8

For every \(u:B\subset \mathbb {R}\rightarrow \mathbb {R}\) in \(\mathcal {L}\) and any compact \(K\subset B\):

  1. u is Lipschitz continuous on K.

  2. The inclusion function \(U^B\) converges in diameter in K with order at least one.

  3. The scheme of estimators \((u^{cv},u^{cc})\) converges pointwise in K with order at least 2.

Lemma A.9

\(\mathcal {U}:\mathbb {M}B\rightarrow \mathbb {MR}\) is (1,2)-convergent on \(\mathbb {M}K\) for any compact \(K\subset B\).

Proof

Choose any compact \(K\subset B\). By Assumption A.8, there exists \(\tau _{BB}\ge 0\) such that \(w(U^B(X^B))\le \tau _{BB}w(X^B)\) for all \(X^B\in \mathbb {I}K\). Let \(L\in \mathbb {R}_+\) be a Lipschitz constant for u on K and let \(\tau \in \mathbb {R}_+\) be a pointwise convergence-order prefactor for the scheme of estimators \((u^{cv},u^{cc})\) in K as per Assumption A.8. Choose any \(\mathcal {X}\in \mathbb {M}K\). Since both of the points \(\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B))\) and \(\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B))\) lie in \(X^C=[x^{cv},x^{cc}]\), it follows that

$$\begin{aligned} |\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B))-\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B))|\le w(X^C). \end{aligned}$$
(A.18)

Then,

$$\begin{aligned} w(U^C(\mathcal {X}))&= |u^{cc}(X^B,\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B))) - u^{cv}(X^B,\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B)))|, \nonumber \\&\le |u^{cc}(X^B,\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B))) - u(\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B)))| \nonumber \\&\quad +|u(\mathrm {mid}(x^{cv},x^{cc},x^{\max }(X^B)))-u(\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B)))| \nonumber \\&\quad +|u(\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B)))-u^{cv}(X^B,\mathrm {mid}(x^{cv},x^{cc},x^{\min }(X^B)))|, \nonumber \\&\le \tau w(X^B)^2 +Lw(X^C) +\tau w(X^B)^2. \end{aligned}$$
(A.19)

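Together with the first-order bound \(w(U^B(X^B))\le \tau _{BB}w(X^B)\), the estimate (A.19) bounds \(w(U^C(\mathcal {X}))\) by a term linear in \(w(X^C)\) plus a term quadratic in \(w(X^B)\), which is the claimed (1, 2)-convergence. \(\square \)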
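The bound (A.19) can be checked numerically on a simple case. The sketch below is an illustration rather than part of the proof: it again assumes \(u=\exp \) on \(K=[0,1]\), with \(u^{cv}=\exp \) and \(u^{cc}\) the secant over \(X^B\). For this assumed choice, \(L=e\) is a Lipschitz constant on K, and the standard linear-interpolation error bound gives \(\tau =e/8\). The function names are hypothetical.

```python
import math


def mid(a: float, b: float, c: float) -> float:
    """Return the middle value of the three arguments."""
    return sorted((a, b, c))[1]


def exp_relaxation_width(xL: float, xU: float, x_cv: float, x_cc: float) -> float:
    """w(U^C) for u = exp with X^B = [xL, xU] (xL < xU) and X^C = [x_cv, x_cc]."""
    slope = (math.exp(xU) - math.exp(xL)) / (xU - xL)
    cv = math.exp(mid(x_cv, x_cc, xL))                          # u^cv at the mid point
    cc = math.exp(xL) + slope * (mid(x_cv, x_cc, xU) - xL)      # secant u^cc at the mid point
    return cc - cv


def bound_a19(xL: float, xU: float, x_cv: float, x_cc: float, k_hi: float) -> float:
    """Right-hand side of (A.19) with the assumed constants for exp on K = [., k_hi]."""
    lipschitz = math.exp(k_hi)   # sup of exp' on K
    tau = math.exp(k_hi) / 8.0   # secant error <= max|exp''| * w^2 / 8
    return 2.0 * tau * (xU - xL) ** 2 + lipschitz * (x_cc - x_cv)


# With a degenerate X^C the linear term vanishes, so the width should
# shrink quadratically as X^B shrinks around the point 0.5.
ratios = []
for h in (0.4, 0.2, 0.1, 0.05):
    width = exp_relaxation_width(0.5 - h, 0.5 + h, 0.5, 0.5)
    assert width <= bound_a19(0.5 - h, 0.5 + h, 0.5, 0.5, 1.0)
    ratios.append(width / (2.0 * h) ** 2)
```

The entries of `ratios` stay bounded as `h` decreases, which is the second-order behaviour in \(w(X^B)\) that the lemma describes. A nondegenerate \(X^C\) adds the first-order term \(Lw(X^C)\).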

Combining the results of Appendix A.3.1–A.3.3 with the Composition Theorem A.1, it is straightforward to show that the natural McCormick extension \(\mathcal {F}:\mathbb {M}D\rightarrow \mathbb {MR}^m\) of a factorable function \({{\mathbf {f}}}:D\rightarrow \mathbb {R}^m\) has (1, 2)-convergence on any compact \(K\subset D\), as required by Assumption 5.17.

Cite this article

Schaber, S.D., Scott, J.K. & Barton, P.I. Convergence-order analysis for differential-inequalities-based bounds and relaxations of the solutions of ODEs. J Glob Optim 73, 113–151 (2019). https://doi.org/10.1007/s10898-018-0691-5
