Abstract
The convergence rate of a rectangular-partition-based algorithm is considered. At each step, a hyper-rectangle is selected for subdivision according to a criterion rooted in the statistical-models-based theory of global optimization; only the objective function values are used to compute the selection criterion. The convergence rate is analyzed under the assumption that the objective functions are twice continuously differentiable and defined on the unit cube in d-dimensional Euclidean space. An asymptotic bound on the convergence rate is established, and the results of numerical experiments are included.
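To make the setting concrete, the following is a minimal sketch of a greedy rectangle-subdivision loop of the general kind analyzed here. The selection score (volume over a shifted gap to the record value, with \(g(v)=v^{2/d}\)) is a simplified stand-in assumption, not the paper's exact statistical-model-based criterion.

```python
import math

def minimize(f, d, n_iter=200):
    """Greedy rectangular-partition minimization over the unit cube [0, 1]^d.

    NOTE: the selection score used below -- volume over a shifted gap to
    the record value, with g(v) = v**(2/d) -- is a simplified stand-in
    assumption, not the paper's exact statistical-model-based criterion.
    """
    center = lambda a, b: [(p + q) / 2.0 for p, q in zip(a, b)]
    lo, hi = [0.0] * d, [1.0] * d
    rects = [(lo, hi, f(center(lo, hi)))]      # (lower corner, upper corner, f(center))
    best_x, best_f = center(lo, hi), rects[0][2]
    for _ in range(n_iter):
        record = min(fc for _, _, fc in rects)  # current record value M_n

        def score(rect):                        # larger = more promising
            a, b, fc = rect
            v = math.prod(q - p for p, q in zip(a, b))
            return v / (fc - record + v ** (2.0 / d) + 1e-12)

        i = max(range(len(rects)), key=lambda k: score(rects[k]))
        a, b, _ = rects.pop(i)
        j = max(range(d), key=lambda k: b[k] - a[k])  # bisect the longest edge
        mid = (a[j] + b[j]) / 2.0
        for na, nb in ((a, b[:j] + [mid] + b[j + 1:]),
                       (a[:j] + [mid] + a[j + 1:], b)):
            c = center(na, nb)
            fc = f(c)
            rects.append((na, nb, fc))
            if fc < best_f:
                best_f, best_x = fc, c
    return best_x, best_f
```

On a smooth test function, the loop concentrates subdivisions around the minimizer; the hedged score only illustrates the trade-off between volume and gap to the record value that drives such criteria.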
Acknowledgements
We thank the reviewers for valuable hints and remarks that helped improve the presentation of our results. The work of J. Calvin was supported by the National Science Foundation under Grant No. CMMI-0926949, and the work of G. Gimbutienė and A. Žilinskas was supported by the Research Council of Lithuania under Grant No. P-MIP-17-61.
Appendices
Appendix A
This appendix contains the proof of Lemma 2.
Denote the eigenvalues of \(D^2f(x^*)\) by
Then by Taylor’s theorem (x a column vector),
Since \(D^2f(x^*)\) is symmetric positive definite, we can express it as
for an orthogonal matrix V and diagonal matrix \(\varTheta =\mathrm{diag}(\theta _1, \theta _2,\ldots ,\theta _d)\). Then
and
where \(Tx \equiv \varTheta ^{1/2}V^Tx\).
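The change of variables \(Tx=\varTheta ^{1/2}V^Tx\) satisfies \(x^TD^2f(x^*)x=\Vert Tx\Vert ^2\); this can be checked numerically with a hand-built \(2\times 2\) example (the rotation angle and eigenvalues below are arbitrary choices):

```python
import math

# Verify x^T A x = ||T x||^2 for A = V Theta V^T and T = Theta^{1/2} V^T,
# with V a 2x2 rotation and Theta = diag(theta_1, theta_2), theta_i > 0.
t = 0.7                        # arbitrary rotation angle
V = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]
theta = [2.0, 5.0]             # arbitrary positive eigenvalues

# Assemble A = V Theta V^T entrywise
A = [[sum(V[i][k] * theta[k] * V[j][k] for k in range(2))
      for j in range(2)] for i in range(2)]

def x_A_x(x):
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

def Tx_norm_sq(x):
    y = [V[0][0] * x[0] + V[1][0] * x[1],   # y = V^T x
         V[0][1] * x[0] + V[1][1] * x[1]]
    return theta[0] * y[0] ** 2 + theta[1] * y[1] ** 2  # ||Theta^{1/2} y||^2

for x in ([1.0, 0.0], [0.3, -1.2], [2.0, 2.0]):
    assert abs(x_A_x(x) - Tx_norm_sq(x)) < 1e-10
```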
Let \(B_c^d(0)\) denote the ball of radius c, centered at 0, in \(\mathbb {R}^d\). For \(c>0\), let
which is positive since the minimizer is assumed unique and f is continuous. Then
and
by (24). For any \(\eta \in ]0,1/2]\), we can choose \(c>0\) small enough so that
Therefore,
as \(\epsilon \downarrow 0\), and similarly
Using the orthogonality of V, for any \(b>0\),
where \(E(c,b,\epsilon )\) is the image of the ball of radius c under the map \(x_i \mapsto x_i(\theta _ib/\epsilon )^{1/2}\), and we used the substitution \(y_i \leftarrow x_i\left( \theta _i b/\epsilon \right) ^{1/2}\) in the last equation. Therefore,
and (27) gives the bounds
Let \(\mathcal {V}(x)=\pi ^{d/2}\varGamma (d/2 +1)^{-1}x^d\) denote the volume of the d-dimensional ball of radius x. For \(z>0\),
For \(z>1\),
and for \(A<z\),
Combining this inequality with (28) we conclude that
Set
and
By (29), \(G_d(c)/\log (c)\rightarrow 1\) as \(c\uparrow \infty \). For any \(b>0\), we have the bounds
and
Since \(\eta >0\) is arbitrary, the fact that \(G_d(c)/\log (c)\rightarrow 1\) as \(c\uparrow \infty \) completes the proof.
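As a quick numerical aside, the constant \(\mathcal {V}(x)=\pi ^{d/2}\varGamma (d/2+1)^{-1}x^d\) used above can be sanity-checked against the familiar low-dimensional formulas:

```python
import math

# Volume of the d-dimensional ball of radius x:
#   V(x) = pi^(d/2) / Gamma(d/2 + 1) * x^d
def ball_volume(d, x):
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * x ** d

# d = 1: interval of length 2x; d = 2: pi x^2; d = 3: (4/3) pi x^3
assert abs(ball_volume(1, 5.0) - 10.0) < 1e-12
assert abs(ball_volume(2, 1.0) - math.pi) < 1e-12
assert abs(ball_volume(3, 2.0) - (4.0 / 3.0) * math.pi * 8.0) < 1e-9
```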
Appendix B
This appendix contains the proofs of the lemmas upon which the proof of Theorem 1 is based.
Proof of Lemma 3
Let \(m\le n\) be the last time before n that the smallest hyper-rectangle was about to be split, so \(v_m = 2v_n\) and by (5),
Note that \(m\ge n_0\) by definition of \(n_2\).
We will show by induction that
for \(k\in \{1,2,\ldots ,n-m\}\), if the hyper-rectangle that the algorithm is subdividing at iteration \(m+k\) has not been previously subdivided since time m. At other times the bound is twice as large.
Let us first consider \(\{\rho ^{m+1}_i: i\le m+1\}\). Suppose that i is the split hyper-rectangle at time m, and let j denote a non-split hyper-rectangle, so \(\rho ^m_j \le \rho _i^m\). Then there exists a point \(c_j\in R_j\) such that
Next consider a child of the split subrectangle i. Since we are splitting the smallest subrectangle, \(v_m = 2v_n\), and there exists a point \(c_i\in R_i\) such that (supposing that one child has index i at time \(m+1\))
We have established the base case for the induction. Now consider iteration \(m+k+1\), \(1\le k<n-m\). For the induction hypothesis, assume that
if hyper-rectangle i has not been split since time m and
if hyper-rectangle i has been split since time m.
Suppose that the most promising hyper-rectangle at iteration \(m+k+1\) is subrectangle \(R_j=\prod _{i=1}^d[a_i,b_i]\),
Assume that this hyper-rectangle has not been split since time m. Let us suppose that during this interval (between m and the next time that the smallest hyper-rectangle is split) R is split r times, and denote by \(L^\prime \) the piecewise multilinear function defined over R after the splitting evaluations. Then
where
Set
Then, denoting by \(\rho _1\) the sum of the \(\rho \) values for the resulting subrectangles of R at time \(m+k+q\), we have
Let
where \(a=\min _{s\in R} L_{m+k+1}(s)-M_n+g(v_n) >0\). Then
The latter ratio is maximized by \(Q\equiv 0\), which corresponds to \(\rho _0 = \left| R\right| /a^{d/2}\), and so
We used the inequalities
which follow from the induction hypothesis.
We have shown that whenever the algorithm is about to subdivide a hyper-rectangle for the first time since m, \(\rho ^{m+k}\le 2/\lambda \log (m+k)\). After a hyper-rectangle is subdivided, subsequent subdivisions can never result in \(\rho \) more than twice as large.
The proof by induction of (30) is complete. \(\square \)
Proof of Lemma 4
Suppose (to get a contradiction) that \(\left| R_i\right| =v_n^*\), \(\left| R_j\right| =v_n\), and \(v_n^*=4 v_n\), yet we are about to split hyper-rectangle \(R_j\), resulting in \(v_n^*>4v_n\); that is, \(\rho ^n_j \ge \rho ^n_i\):
for some \(c_i\in R_i, c_j\in R_j\). This implies that
But \(L_n(c_j) \ge M_n\) and
Thus
This means that
since \(d\ge 2\). But this contradicts (31), and establishes (9).
The proof of (10) follows from
\(\square \)
Proof of Lemma 5
Recall that \(R^*_n\) denotes the hyper-rectangle containing \(x^*\) at time n, and that \(\left| R^*_n\right| = v^*_n \le 4 v_n\) by Lemma 4. If h is the minimal edge length of \(R^*_n\), then \(v^*_n = 2^j h^d\) for some \(0 \le j <d\), and the diameter of \(R^*_n\) is less than \(2h\sqrt{d}\). Also \(h = 2^{-j/d}(v^*_n)^{1/d}\). Therefore, for \(s\in R^*_n\),
Therefore, by the previous inequality
\(\square \)
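The elementary geometry behind Lemma 5 is easy to verify numerically: a hyper-rectangle produced by bisections, with \(j\) edges of length \(2h\) and \(d-j\) edges of length \(h\), has volume \(2^jh^d\), satisfies \(h=2^{-j/d}(v^*_n)^{1/d}\), and has diameter \(\sqrt{d+3j}\,h<2h\sqrt{d}\) for \(j<d\). A short check (the edge length \(h\) below is an arbitrary choice):

```python
import math

def check(d, j, h=0.37):
    # j edges of length 2h, d - j edges of length h, with 0 <= j < d
    edges = [2 * h] * j + [h] * (d - j)
    vol = math.prod(edges)
    diam = math.sqrt(sum(e * e for e in edges))
    assert abs(vol - 2 ** j * h ** d) < 1e-12                 # v* = 2^j h^d
    assert diam < 2 * h * math.sqrt(d)                        # diameter < 2 h sqrt(d)
    assert abs(h - 2 ** (-j / d) * vol ** (1 / d)) < 1e-12    # h = 2^(-j/d) (v*)^(1/d)

for d in range(2, 8):
    for j in range(d):
        check(d, j)
```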
Proof of Lemma 6
Equation (11) follows from Lemma 3.
For \(n\ge n_2(f)\), the \(\rho \) values for the children of a split hyper-rectangle will not be much smaller than the parent's. The largest possible decrease occurs if the values on the boundary of \(R_i\) are constant, say with value \(M_{m+k}+A\), and the new function values are the largest possible, namely \(M_{n}+A+a\), where
Considering a child of the split hyper-rectangle (say with index i),
using the inequality \((1+x/k)^k \le \exp (x)\) and the fact that \(d\ge 2\) implies that \(4^{2/d}/(4d)\le 1/2\).
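Both elementary facts invoked here, \((1+x/k)^k \le \exp (x)\) and \(4^{2/d}/(4d)\le 1/2\) for \(d\ge 2\), can be spot-checked:

```python
import math

# (1 + x/k)^k <= exp(x) for x > 0 and integer k >= 1
for x in (0.1, 1.0, 3.7):
    for k in (1, 2, 10, 100):
        assert (1 + x / k) ** k <= math.exp(x) + 1e-12

# d >= 2 implies 4^(2/d) / (4 d) <= 1/2, with equality at d = 2
for d in range(2, 50):
    assert 4 ** (2 / d) / (4 * d) <= 0.5 + 1e-15
```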
Consider the average of those \(\{\rho ^n_i\}\) at time \(n\) that resulted from splits after time \(n/2\). Over this time interval we have
by Lemma 5, since \(n\ge 2n_2(f)\). Since the children \(\rho \) values will not be much smaller than the parent’s,
since \(v_i\ge v_n\). \(\square \)
Proof of Lemma 7
Fix a particular hyper-rectangle \(R_i\) with
We first show that the integrals
are close. As in the proof of Lemma 3, let
where \(a=\min _{s\in {R_i}} L_n(s)-f^*+g(v_n) >0\). Then, from (6),
and
The smallest value of the ratio occurs when \(Q(s)\equiv 0\).
The case \(Q\equiv 0\) corresponds to \(\rho = \left| {R_i}\right| /a^{d/2}\), and so
Now
Therefore,
since the last expression is increasing in \(d\ge 2\). Turning to an upper bound for
observe that
We have shown that
Recall that \(v_n^*\) denotes the volume of the hyper-rectangle containing the minimizer \(x^*\). Since \(n\ge n_2\), by Lemma 4 \(v_n^* \le 4 v_n\) and
Therefore,
Combining these inequalities with (32) gives
Since
this completes the proof. \(\square \)
Cite this article
Calvin, J., Gimbutienė, G., Phillips, W.O. et al. On convergence rate of a rectangular partition based global optimization algorithm. J Glob Optim 71, 165–191 (2018). https://doi.org/10.1007/s10898-018-0636-z