
Population model-based optimization

Original Paper · Journal of Global Optimization

Abstract

Model-based optimization methods are a class of stochastic search methods that iteratively find candidate solutions by generating samples from a parameterized probabilistic model on the solution space. To capture the multi-modality of objective functions better than traditional model-based methods, which use only a single model, we propose a framework that uses a population of models at every iteration, with an adaptive mechanism to propagate the population over iterations. The adaptive mechanism is derived from estimating the optimal parameter of the probabilistic model in a Bayesian manner, and thus provides a principled way to determine the diversity in the population of models. We provide theoretical justification for the convergence of this framework by showing that the posterior distribution of the parameter asymptotically converges to a degenerate distribution concentrating on the optimal parameter. Under this framework, we develop two practical algorithms by incorporating sequential Monte Carlo methods, and carry out numerical experiments to illustrate their performance.
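To make the framework concrete, the following is a minimal sketch (not the authors' implementation) of a population model-based optimization loop: a population of Gaussian sampling models proposes candidates, each model is weighted by the quality of its samples, and the population is propagated by weighted resampling plus shrinking random perturbations, in the spirit of sequential Monte Carlo. The objective, population size, and noise schedule are all illustrative.

```python
import numpy as np

# Illustrative multimodal objective to maximize.
def H(x):
    return np.exp(-0.5 * (x - 2.0) ** 2) + 0.8 * np.exp(-0.5 * (x + 2.0) ** 2)

rng = np.random.default_rng(0)
n_models, n_samples = 50, 20
mu = rng.uniform(-5.0, 5.0, n_models)  # population of Gaussian models (their means)
sigma = 1.0

for k in range(30):
    # Each model in the population generates candidate solutions.
    x = rng.normal(mu[:, None], sigma, size=(n_models, n_samples))
    scores = H(x).mean(axis=1)          # how promising each model's region is
    w = np.exp(scores - scores.max())   # weight models by sample quality
    w /= w.sum()
    # Propagate the population: resample models by weight (SMC-style),
    # then jitter with shrinking noise to maintain diversity.
    mu = mu[rng.choice(n_models, size=n_models, p=w)]
    mu += (0.5 * 0.9 ** k) * rng.uniform(-1.0, 1.0, n_models)

print("best model mean:", mu[np.argmax(H(mu))])
```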



Acknowledgments

This work was supported by the National Science Foundation under Grant CMMI-1130273, and by the Air Force Office of Scientific Research under YIP Grant FA-9550-12-1-0250. A preliminary conference version of this paper, which presents the PMO framework and part of the numerical results, appeared in the Proceedings of the 2013 Winter Simulation Conference.

Author information

Corresponding author: Enlu Zhou.

Appendix

1.1 Proof of Lemma 1

Proof

Define \(c_{k}\triangleq \mathbb {E}_{b_{k}}[H(X)]-\mathbb {E}_{\tilde{b}_{k}}[H(X)]\). First, we show that \(\mathbb {E}_{b_k}[H(X)]\ge \mathbb {E}_{b_{k-1}}[H(X)]-c_{k-1}\). By (11), the posterior distribution \(b_k(x,\theta )\) can be expressed by

$$\begin{aligned} b_k(x,\theta )\triangleq p(x,\theta |y_{1:k}) =\frac{\tilde{b}_{k-1}(x,\theta )\varphi (H(x)-y_k)}{\mathbb {E}_{\tilde{b}_{k-1}}\left[ \varphi (H(X)-y_k)\right] }. \end{aligned}$$
(17)

Then, the expectation of \(H(X)\) with respect to \(b_k(x,\theta )\) is represented by

$$\begin{aligned}&\mathbb {E}_{b_k}[H(X)]=\frac{\mathbb {E}_{\tilde{b}_{k-1}}\left[ H(X)\varphi (H(X)-y_k)\right] }{\mathbb {E}_{\tilde{b}_{k-1}}\left[ \varphi (H(X)-y_k)\right] }\\&\quad =\frac{\mathbb {E}_{\tilde{b}_{k-1}}\left[ (H(X)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])\varphi (H(X)-y_k)\right] +\mathbb {E}_{\tilde{b}_{k-1}}\left[ H(X)\right] \mathbb {E}_{\tilde{b}_{k-1}}\left[ \varphi (H(X)-y_k)\right] }{\mathbb {E}_{\tilde{b}_{k-1}}\left[ \varphi (H(X)-y_k)\right] }\\&\quad = \frac{\mathbb {E}_{\tilde{b}_{k-1}}\left[ (H(X)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])\varphi (H(X)-y_k)\right] }{\mathbb {E}_{\tilde{b}_{k-1}}\left[ \varphi (H(X)-y_k)\right] }+\mathbb {E}_{\tilde{b}_{k-1}}\left[ H(X)\right] \\&\quad \ge \mathbb {E}_{\tilde{b}_{k-1}}\left[ H(X)\right] , \end{aligned}$$

where the inequality follows from \(\mathbb {E}_{\tilde{b}_{k-1}}\left[ (H(X)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])\varphi (H(X)-y_k)\right] \ge 0\), which can be proved as follows.

By Assumption 1, \(\varphi (\cdot )\) is strictly increasing on its support. We have

$$\begin{aligned} \varphi (H(x)-y_k)-\varphi (\mathbb {E}_{\tilde{b}_{k-1}}[H(X)]-y_k)\le 0, \ \text {if}\ H(x)\le \mathbb {E}_{\tilde{b}_{k-1}}[H(X)], \end{aligned}$$

and

$$\begin{aligned} \varphi (H(x)-y_k)-\varphi (\mathbb {E}_{\tilde{b}_{k-1}}[H(X)]-y_k)> 0, \ \text {if}\ H(x)> \mathbb {E}_{\tilde{b}_{k-1}}[H(X)]. \end{aligned}$$

Then,

$$\begin{aligned}&\mathbb {E}_{\tilde{b}_{k-1}}\left[ (H(X)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])\varphi (H(X)-y_k)\right] \\&\quad =\mathbb {E}_{\tilde{b}_{k-1}}\left[ (H(X)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])(\varphi (H(X)-y_k)-\varphi (\mathbb {E}_{\tilde{b}_{k-1}}[H(X)]-y_k))\right] \\&\quad = \int _{H(x)\le \mathbb {E}_{\tilde{b}_{k-1}}[H(X)]}\int _{\Theta } (H(x)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])(\varphi (H(x)-y_k)\\&\qquad -\,\varphi (\mathbb {E}_{\tilde{b}_{k-1}}[H(X)]-y_k))\tilde{b}_{k-1}(x,\theta ) d\theta dx \\&\qquad +\,\int _{H(x)> \mathbb {E}_{\tilde{b}_{k-1}}[H(X)]}\int _{\Theta } (H(x)-\mathbb {E}_{\tilde{b}_{k-1}}[H(X)])(\varphi (H(x)-y_k)\\&\qquad -\,\varphi (\mathbb {E}_{\tilde{b}_{k-1}}[H(X)]-y_k))\tilde{b}_{k-1}(x,\theta ) d\theta dx\\&\quad \ge 0. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathbb {E}_{b_k}[H(X)]\ge \mathbb {E}_{b_{k-1}}[H(X)]-c_{k-1}. \end{aligned}$$

Then, we have

$$\begin{aligned} a_k \triangleq \mathbb {E}_{b_k}[H(X)]+\sum _{i=1}^{k-1}c_i \ge \mathbb {E}_{b_{k-1}}[H(X)]+\sum _{i=1}^{k-2}c_i=a_{k-1}. \end{aligned}$$

Thus, \(\{a_k,\ k=1,2,\ldots \}\) is monotonically increasing. Moreover, \(\{a_k\}\) is upper bounded, since for all \(k\ge 1\), \(a_k \le H^{u} + \sum _{i=1}^{k-1}c_i\) and

$$\begin{aligned} \sum _{i=1}^{k-1}c_{i}\le & {} \sum _{i=1}^{k-1}|c_{i}| \\\le & {} \int _{\mathcal {X}}{H(x) \sum _{i=1}^{k-1}|b_{i}(x) - \tilde{b}_{i}(x)|dx} \\\le & {} \int _{\mathcal {X}}{H(x) \sum _{i=1}^{\infty }|b_{i}(x) - \tilde{b}_{i}(x)|dx} < \infty , \end{aligned}$$

where the last inequality follows from Assumption 4 and the fact that \(\mathcal {X}\) is compact. Since \(\{a_k\}\) is monotonically increasing and upper bounded, \(\lim _{k\rightarrow \infty }a_k\) exists. Using the dominated convergence theorem, we conclude that \(\sum _{i=1}^{\infty }c_i\) exists and

$$\begin{aligned} \sum _{i=1}^{\infty }c_i= & {} \sum _{i=1}^{\infty } \int _{\mathcal {X}}H(x)(b_{i}(x) - \tilde{b}_{i}(x))dx \\= & {} \int _{\mathcal {X}}H(x)\sum _{i=1}^{\infty }(b_{i}(x) - \tilde{b}_{i}(x))dx < \infty . \end{aligned}$$

Therefore, the limit of the right-hand side of \(\mathbb {E}_{b_k}[H(X)] = a_k - \sum _{i=1}^{k-1}c_i\) exists, which implies that \(\lim _{k\rightarrow \infty }\mathbb {E}_{b_k}[H(X)]\) exists. \(\square \)
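The key step above, that reweighting by the increasing function \(\varphi (H(\cdot )-y_k)\) cannot decrease the expectation of \(H\), can be checked numerically on a particle approximation. The sketch below uses an illustrative \(H\) and \(\varphi \); any nonnegative \(\varphi \) that is increasing on its support exhibits the same behavior.

```python
import numpy as np

rng = np.random.default_rng(1)
H = lambda x: 3.0 - np.abs(x - 1.0)       # illustrative bounded objective
phi = lambda s: np.maximum(s, 0.0) ** 2   # nonnegative, increasing on its support

x = rng.uniform(-4.0, 4.0, 100_000)       # particles drawn from \tilde{b}_{k-1}
y_k = np.quantile(H(x), 0.7)              # an illustrative threshold level
w = phi(H(x) - y_k)                       # reweighting realizes b_k, cf. (17)

mean_before = H(x).mean()                 # approximates E_{\tilde{b}_{k-1}}[H(X)]
mean_after = np.average(H(x), weights=w)  # approximates E_{b_k}[H(X)]
assert mean_after >= mean_before          # the inequality established above
print(mean_before, mean_after)
```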

1.2 Proof of Theorem 1

Proof

Since \(y_k\) is monotonically increasing, upper bounded by \(H^*\), and updated only when \(\gamma _k\ge y_{k-1}+\epsilon \), there exists \(K<\infty \) such that \(y_k=y_K\), \(\forall k\ge K\). There are two cases to consider: (i) \(y_K=H^*\), and (ii) \(y_K<H^*\).

(i) Case 1: \(y_K=H^*\)

By (17), we have

$$\begin{aligned} b_k(x)=\tilde{b}_{k-1}(x)\frac{\varphi (H(x)-y_k)}{\mathbb {E}_{\tilde{b}_{k-1}}[\varphi (H(X)-y_k)]}. \end{aligned}$$
(18)

Since \(\varphi (\cdot )\) has support on \([0,H^u-H^l]\), we have \(\varphi (H(x)-y_K)=0\) whenever \(H(x)<H^*\), which gives

$$\begin{aligned} b_k(x)=0, \ \ \forall x\ne x^*, \ \forall k\ge K. \end{aligned}$$

Thus,

$$\begin{aligned} \mathbb {E}_{b_k}[H(X)]=H^*, \ \forall k\ge K, \end{aligned}$$

which completes the proof of case (i).

(ii) Case 2: \(y_K<H^*\)

By Lemma 1, the sequence \(\{\mathbb {E}_{b_k}[H(X)],\ k=1,2,\dots \}\) converges. Let \(\lim _{k\rightarrow \infty }\mathbb {E}_{b_k}[H(X)]=H_*\); we will prove \(H_*=H^*\) by contradiction.

We define the set \(\mathcal {A}\) as

$$\begin{aligned} \mathcal {A}=\left\{ x\in \mathcal {X}:H(x)\ge H_*\right\} . \end{aligned}$$

For any fixed \(x\in \mathcal {A}\) and any finite \(i\), since \(\varphi (H(x)-y_i)>0\) and, by Assumption 3, \(\tilde{b}_i(x)>0\), we have \(b_i(x)>0\) and \(\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]>0\). Therefore, by induction we may represent (18) by

$$\begin{aligned} b_k(x)=b_{1}(x)\prod ^k_{i=2}\frac{\tilde{b}_{i-1}(x)}{b_{i-1}(x)}\frac{\varphi (H(x)-y_i)}{\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]}, \quad \forall x\in \mathcal {A}, \end{aligned}$$
(19)

and from Assumption 4, we have \(\lim _{i\rightarrow \infty }\frac{\tilde{b}_i(x)}{b_i(x)}=1\), almost everywhere in \(\mathcal {A}\).

Hence, almost everywhere in \(\mathcal {A}\),

$$\begin{aligned}&\lim _{i\rightarrow \infty }\frac{\tilde{b}_{i-1}(x)}{b_{i-1}(x)}\frac{\varphi (H(x)-y_i)}{\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]}\\&\quad =\lim _{i\rightarrow \infty }\frac{\tilde{b}_{i-1}(x)}{b_{i-1}(x)}\lim _{i\rightarrow \infty }\frac{\varphi (H(x)-y_i)}{\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]}\\&\quad =\frac{\lim _{i\rightarrow \infty }\varphi (H(x)-y_i)}{\lim _{i\rightarrow \infty }\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]}\\&\quad =\frac{\varphi (H(x)-y_K)}{\lim _{i\rightarrow \infty }\mathbb {E}_{b_{i-1}}[\varphi (H(X)-y_i)]} \end{aligned}$$

where the last equality follows from the continuity of \(\varphi (\cdot )\) under Assumption 1, and \(\lim _{i\rightarrow \infty }\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]=\lim _{i\rightarrow \infty }\mathbb {E}_{b_{i-1}}[\varphi (H(X)-y_i)]\) follows from the bounded convergence theorem and Assumption 4.

Suppose \(\lim _{k\rightarrow \infty }\mathbb {E}_{b_k}[H(X)]=H_*<H^*\). Then an easy contradiction argument shows that

$$\begin{aligned} C\triangleq \lim _{k\rightarrow \infty }\int _{\{x:H(x)\le H_*\}}b_k(x)dx>0. \end{aligned}$$

We can write

$$\begin{aligned} \mathbb {E}_{b_{i-1}}[\varphi (H(X)-y_i)]= & {} \int _{\{x:H(x)\le H_*\}}\varphi (H(x)-y_i)b_{i-1}(x)dx\\&+\,\int _{\{x:H(x)> H_*\}}\varphi (H(x)-y_i)b_{i-1}(x)dx\\\le & {} \varphi (H_*-y_i)\int _{\{x:H(x)\le H_*\}}b_{i-1}(x)dx\\&+\,\varphi (H^*-y_i)\int _{\{x:H(x)> H_*\}}b_{i-1}(x)dx. \end{aligned}$$

Taking limits on both sides of the inequality as \(i\rightarrow \infty \), by the continuity of \(\varphi (\cdot )\) we have

$$\begin{aligned} \lim _{i\rightarrow \infty }\mathbb {E}_{b_{i-1}}[\varphi (H(X)-y_i)]\le & {} \varphi (H_*-y_K)\lim _{i\rightarrow \infty }\int _{\{x:H(x)\le H_*\}}b_{i-1}(x)dx\\&+\,\varphi (H^*-y_K)\lim _{i\rightarrow \infty }\int _{\{x:H(x)> H_*\}}b_{i-1}(x)dx\\= & {} \varphi (H_*-y_K)C+\varphi (H^*-y_K)(1-C). \end{aligned}$$

We define the set \(\mathcal {B}\) as

$$\begin{aligned} \mathcal {B}=\left\{ x\in \mathcal {A}:\varphi (H_*-y_K)C+\varphi (H^*-y_K)(1-C)<\varphi (H(x)-y_K)\right\} . \end{aligned}$$

Since \(C>0\) and \(\varphi (\cdot )\) is strictly increasing, \(\varphi (H_*-y_K)C+\varphi (H^*-y_K)(1-C)<\varphi (H^*-y_K)\). Thus, \(\mathcal {B}\) has strictly positive Lebesgue measure by Assumption 2.

Hence, almost everywhere in \(\mathcal {B}\),

$$\begin{aligned} \lim _{i\rightarrow \infty }\frac{\tilde{b}_{i-1}(x)}{b_{i-1}(x)}\frac{\varphi (H(x)-y_i)}{\mathbb {E}_{\tilde{b}_{i-1}}[\varphi (H(X)-y_i)]} \ge \frac{\varphi (H(x)-y_K)}{\varphi (H_*-y_K)C+\varphi (H^*-y_K)(1-C)}>1, \end{aligned}$$

by the definition of \(\mathcal {B}\). From the inequality above and (19), we have

$$\begin{aligned} \lim _{k\rightarrow \infty }b_k(x)=\infty , \ \text {almost everywhere in} \ \mathcal {B}. \end{aligned}$$

By Fatou’s lemma and the positive Lebesgue measure of \(\mathcal {B}\), we have

$$\begin{aligned} \liminf _{k\rightarrow \infty }\int _{\mathcal {B}}b_k(x)dx\ge \int _{\mathcal {B}}\liminf _{k\rightarrow \infty }b_k(x)dx=\infty , \end{aligned}$$

which contradicts the fact that

$$\begin{aligned} \liminf _{k\rightarrow \infty }\int _{\mathcal {B}}b_k(x)dx\le \liminf _{k\rightarrow \infty }\int _{\mathcal {X}}b_k(x)dx=1. \end{aligned}$$

Therefore, we conclude that \(H_*=H^*\), and \(\lim _{k\rightarrow \infty }\mathbb {E}_{b_k}[H(X)]=H^*\). \(\square \)
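Theorem 1 can also be observed numerically by iterating the reweighting (17) on a particle approximation with a threshold \(y_k\) that is raised only in steps of at least \(\epsilon \); the mean of \(H\) under the particle distribution then climbs toward \(H^*\). The sketch below uses an illustrative \(H\) and \(\varphi \) and strips away the parameter-propagation part of the algorithm to isolate this mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
H = lambda x: np.cos(3.0 * x) * np.exp(-0.1 * x * x)  # illustrative; H* = 1 at x = 0
phi = lambda s: np.maximum(s, 0.0)                    # increasing on its support

x = rng.uniform(-4.0, 4.0, 20_000)   # particles approximating b_0
y, eps = H(x).min(), 0.01

for k in range(40):
    gamma = np.quantile(H(x), 0.8)
    if gamma >= y + eps:             # y_k is raised only by at least eps, so it
        y = gamma                    # freezes at some finite K, as in the proof
    w = phi(H(x) - y) + 1e-12        # tiny floor avoids an all-zero weight vector
    x = rng.choice(x, size=x.size, p=w / w.sum())  # resampled particles follow b_k

print(H(x).mean())  # close to H* = 1
```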

1.3 Proof of Theorem 2

Proof

We prove this theorem by showing that Assumption 4 is satisfied under Assumptions 5–7.

By (8) and (10), we have

$$\begin{aligned} \tilde{b}_k(x,\theta )=p(x|\theta ,y_{1:k})\int _{\Theta }p(\theta |\theta _k,y_{1:k})b_k(\theta _k)d\theta _k. \end{aligned}$$
(20)

For any fixed \(\theta \in \Theta \), let \(S_k^m(\theta )=\left\{ \theta _k \in \Theta : |\theta _k^i-\theta ^i|<\delta _k, i=1,\ldots ,m \right\} \), where \(\theta ^i\) denotes the \(i\)-th element of \(\theta \) and \(m\) is the dimension of \(\theta \). Let \(V(S_k^m(\theta ))\) denote the volume of \(S_k^m(\theta )\). By Assumption 5, the artificial noise \(\Gamma _k\) is uniformly distributed on \([-\delta _k,\delta _k]^m\), so the p.d.f. \(p(\theta |\theta _k,y_{1:k})\) is

$$\begin{aligned} p(\theta |\theta _k,y_{1:k})=\frac{\mathbb {I}_{\{\theta _k\in S_k^m(\theta )\}}}{V(S_k^m(\theta ))}. \end{aligned}$$
(21)
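In particle terms, (21) says that the smoothed density \(\tilde{b}_k\) is realized by adding independent uniform noise on \([-\delta _k,\delta _k]^m\) to each parameter particle; a minimal sketch (dimensions and values illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 10_000
theta = rng.normal(size=(n, m))  # particles approximating b_k(theta)

delta_k = 0.05
# Uniform kernel on [-delta_k, delta_k]^m, with V(S_k^m(theta)) = (2*delta_k)**m.
theta_smoothed = theta + rng.uniform(-delta_k, delta_k, size=(n, m))
# theta_smoothed now approximates \tilde{b}_k(theta).
```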

Plugging (21) into (20), we have

$$\begin{aligned} \tilde{b}_k(x,\theta )=p(x|\theta ,y_{1:k})\int _{S_k^m(\theta )}\frac{1}{V(S_k^m(\theta ))}b_k(\theta _k)d\theta _k. \end{aligned}$$

The joint posterior p.d.f. \(b_k(x,\theta )\) can be represented by

$$\begin{aligned} b_k(x,\theta )=p(x|\theta ,y_{1:k})b_k(\theta ). \end{aligned}$$

Then,

$$\begin{aligned} |b_k(x,\theta )-\tilde{b}_k(x,\theta )|= & {} p(x|\theta ,y_{1:k})\left| b_k(\theta )-\int _{S_k^m(\theta )}\frac{1}{V(S_k^m(\theta ))}b_k(\theta _k)d\theta _k\right| \\\le & {} \frac{p(x|\theta ,y_{1:k})}{V(S_k^m(\theta ))}\int _{S_k^m(\theta )}\left| b_k(\theta )-b_k(\theta _k)\right| d\theta _k. \end{aligned}$$

By Assumption 6, \(b_k(\theta )\) is continuous on the closure of \(S_k^m(\theta )\) and differentiable on the open set \(S_k^m(\theta )=\{\theta _k \in \Theta :|\theta _k^i-\theta ^i|<\delta _k,\ i=1,\ldots ,m\}\). By the mean value theorem, \(\exists \ \xi \in S_k^m(\theta )\), such that

$$\begin{aligned} \left| b_k(\theta )-b_k(\theta _k)\right| \le \left\| \nabla _\theta b_k(\xi )\right\| _2\left\| \theta -\theta _k\right\| _2\le \left\| \nabla _\theta b_k(\xi )\right\| m\delta _k, \end{aligned}$$

where \(\Vert \cdot \Vert _2\) denotes the Euclidean norm and \(\Vert \cdot \Vert \) denotes the maximum norm.

Define

$$\begin{aligned} d_k=\sup _{\xi \in \Theta }\left( \left\| \nabla _\theta b_k(\xi )\right\| \right) m\delta _k, \end{aligned}$$

and let \(V(\Theta )\) denote the volume of \(\Theta \), which is bounded. Now, since

$$\begin{aligned} |b_k(x)-\tilde{b}_k(x)|\le \int _{\Theta }|b_k(x,\theta )-\tilde{b}_k(x,\theta )|d\theta , \end{aligned}$$

to prove \(\sum _{k=1}^{\infty }|b_k(x)-\tilde{b}_k(x)|<\infty \) almost everywhere in \(\mathcal {X}\), it is sufficient to show \(\sum _{k=1}^{\infty } d_k <\infty \).

By (6) and (17), we have

$$\begin{aligned} b_k(\theta )=\frac{\int _{\mathcal {X}}\tilde{b}_{k-1}(x,\theta )\varphi (H(x)-y_k)dx}{\mathbb {E}_{\tilde{b}_{k-1}}[\varphi (H(X)-y_k)]}. \end{aligned}$$

The gradient of \(b_k(\theta )\) is

$$\begin{aligned} \nabla _\theta b_k(\theta )= & {} \frac{\nabla _\theta \int _{\mathcal {X}}\tilde{b}_{k-1}(x,\theta )\varphi (H(x)-y_k)dx}{\mathbb {E}_{\tilde{b}_{k-1}}[\varphi (H(X)-y_k)]}\\= & {} \frac{\int _{\mathcal {X}}\nabla _\theta \tilde{b}_{k-1}(x,\theta )\varphi (H(x)-y_k)dx}{\mathbb {E}_{\tilde{b}_{k-1}}[\varphi (H(X)-y_k)]}. \end{aligned}$$

Since there exists \(K<\infty \) such that \(y_k=y_K\), \(\forall k\ge K\), the gradient of \(b_k(\theta )\) can be bounded as follows:

\(\forall k\ge K,\)

$$\begin{aligned} \nonumber \nabla _\theta b_k(\theta )\le & {} \frac{\varphi (H^u-y_K)}{\varphi (0)}\int _{\mathcal {X}}\nabla _\theta \tilde{b}_{k-1}(x,\theta )dx\\= & {} \frac{\varphi (H^u-y_K)}{\varphi (0)}\nabla _\theta \tilde{b}_{k-1}(\theta ), \end{aligned}$$
(22)

\(\forall k< K,\)

$$\begin{aligned} \nonumber \nabla _\theta b_k(\theta )\le & {} \frac{\varphi (H^u-H^l)}{\varphi (0)}\int _{\mathcal {X}}\nabla _\theta \tilde{b}_{k-1}(x,\theta )dx\\= & {} \frac{\varphi (H^u-H^l)}{\varphi (0)}\nabla _\theta \tilde{b}_{k-1}(\theta ), \end{aligned}$$
(23)

where the inequalities follow from \(\varphi (0)\le \varphi (H(x)-y_k)\le \varphi (H^u-y_K)\), \(\forall k\ge K\), and \(\varphi (0)\le \varphi (H(x)-y_k)\le \varphi (H^u-H^l)\), \(\forall k< K\). Taking the maximum norm on both sides of (22) and (23), we have the following inequalities:

\(\forall k\ge K,\)

$$\begin{aligned} \Vert \nabla _\theta b_{k}(\theta )\Vert \le \frac{\varphi (H^u-y_K)}{\varphi (0)}\Vert \nabla _\theta \tilde{b}_{k-1}(\theta )\Vert , \end{aligned}$$
(24)

\(\forall k< K,\)

$$\begin{aligned} \Vert \nabla _\theta b_{k}(\theta )\Vert \le \frac{\varphi (H^u-H^l)}{\varphi (0)}\Vert \nabla _\theta \tilde{b}_{k-1}(\theta )\Vert . \end{aligned}$$
(25)

Next, we prove \(\exists \ \eta _k\in \Theta \), such that \(\Vert \nabla _\theta \tilde{b}_{k}(\theta )\Vert \le \Vert \nabla _\theta b_{k}(\eta _k)\Vert \), where \(\eta _k\) is dependent on \(\theta \).

Let \(\vec {\varepsilon }^i=(0,0,\ldots ,0,\varepsilon ,0,\ldots ,0)\), where the \(i\)-th element of \(\vec {\varepsilon }^i\) is \(\varepsilon \) and the other elements are 0. We denote \(\theta =(\theta ^1,\theta ^2,\ldots ,\theta ^m)\) and \(\bar{\theta }^i=(\theta ^1,\theta ^2,\ldots ,\theta ^{i-1},\theta ^{i+1},\ldots ,\theta ^m)\). With this notation, \(\tilde{b}_k(\theta )\) can alternatively be represented by

$$\begin{aligned} \tilde{b}_k(\theta )=\int _{S_k^m(\theta )}\frac{b_k(\theta _k)}{V(S_k^m(\theta ))}d\theta _k=\int _{\theta ^i-\delta _k}^{\theta ^i+\delta _k}\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\bar{\theta }_k^id\theta _k^i, \end{aligned}$$

where \(S_k^{m-1}(\bar{\theta }^i)=\{\bar{\theta }^i_k\in \Theta : |\theta ^j_k-\theta ^j|<\delta _k,\ j=1,\ldots ,i-1,i+1,\ldots ,m\}\). Then,

$$\begin{aligned} |\tilde{b}_k(\theta +\vec {\varepsilon }^i)-\tilde{b}_k(\theta )|= & {} \left| \int _{\theta ^i-\delta _k+\varepsilon }^{\theta ^i+\delta _k+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\theta _k-\int _{\theta ^i-\delta _k}^{\theta ^i+\delta _k}\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\theta _k \right| \\= & {} \left| \int _{\theta ^i+\delta _k}^{\theta ^i+\delta _k+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\theta _k-\int _{\theta ^i-\delta _k}^{\theta ^i-\delta _k+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\theta _k \right| \\= & {} \left| \int _{\theta ^i}^{\theta ^i+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{b_k(\theta _k^i+\delta _k,\bar{\theta }_k^i)-b_k(\theta _k^i-\delta _k,\bar{\theta }_k^i)}{V(S_k^m(\theta ))}d\theta _k \right| \\\le & {} \int _{\theta ^i}^{\theta ^i+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{\left| b_k(\theta _k^i+\delta _k,\bar{\theta }_k^i)-b_k(\theta _k^i-\delta _k,\bar{\theta }_k^i)\right| }{V(S_k^m(\theta ))}d\theta _k. \end{aligned}$$

Because \(S_k^m(\theta )\) is compact, \(\exists \ t\in S_k^m(\theta )\) such that, \(\forall \theta _k\in S_k^m(\theta )\), we have

$$\begin{aligned} |b_k(\theta _k^i+\delta _k,\bar{\theta }_k^i)-b_k(\theta _k^i-\delta _k,\bar{\theta }_k^i)|\le |b_k(t^i+\delta _k,\bar{t}^i)-b_k(t^i-\delta _k,\bar{t}^i)|. \end{aligned}$$

Thus,

$$\begin{aligned} |\tilde{b}_k(\theta +\vec {\varepsilon }^i)-\tilde{b}_k(\theta )|\le \int _{\theta ^i}^{\theta ^i+\varepsilon }\int _{S_k^{m-1}(\bar{\theta }^i)}\frac{|b_k(t^i+\delta _k,\bar{t}^i)-b_k(t^i-\delta _k,\bar{t}^i)|}{V(S_k^m(\theta ))}d\theta _k. \end{aligned}$$

By the mean value theorem, \(\exists \ \tau \in \Theta \), such that

$$\begin{aligned} |b_k(t^i+\delta _k,\bar{t}^i)-b_k(t^i-\delta _k,\bar{t}^i)|=\left| \frac{\partial b_k}{\partial \theta ^i}(\tau )\right| 2\delta _k. \end{aligned}$$

Thus,

$$\begin{aligned} |\tilde{b}_k(\theta +\vec {\varepsilon }^i)-\tilde{b}_k(\theta )|\le \frac{\varepsilon }{2\delta _k}\left| \frac{\partial b_k}{\partial \theta ^i}(\tau )\right| 2\delta _k=\varepsilon \left| \frac{\partial b_k}{\partial \theta ^i}(\tau )\right| . \end{aligned}$$

By the definition of the derivative, we have

$$\begin{aligned} \left| \frac{\partial \tilde{b}_k(\theta )}{\partial \theta ^i}\right| =\lim _{\varepsilon \rightarrow 0}\frac{|\tilde{b}_k(\theta +\vec {\varepsilon }^i)-\tilde{b}_k(\theta )|}{\varepsilon }\le \left| \frac{\partial b_k}{\partial \theta ^i}(\tau )\right| . \end{aligned}$$

It is easy to observe from the above inequality that \(\exists \eta _k \in \Theta \), such that

$$\begin{aligned} \Vert \nabla _\theta \tilde{b}_k(\theta )\Vert \le \Vert \nabla _\theta b_k(\eta _k)\Vert . \end{aligned}$$
(26)

By (24)–(26), we may bound \(\Vert \nabla _\theta b_k(\theta )\Vert \) in terms of \(\Vert \nabla _\theta b_{k-1}(\cdot )\Vert \). Therefore, \(\exists \ \eta _{k-1}\in \Theta \), such that

$$\begin{aligned} \Vert \nabla _\theta b_k(\theta )\Vert \le \frac{\varphi (H^u-y_K)}{\varphi (0)}\Vert \nabla _\theta \tilde{b}_{k-1}(\theta )\Vert \le \frac{\varphi (H^u-y_K)}{\varphi (0)}\Vert \nabla _\theta b_{k-1}(\eta _{k-1})\Vert ,\quad \forall k\ge K, \end{aligned}$$

and

$$\begin{aligned} \Vert \nabla _\theta b_k(\theta )\Vert \le \frac{\varphi (H^u-H^l)}{\varphi (0)}\Vert \nabla _\theta \tilde{b}_{k-1}(\theta )\Vert \le \frac{\varphi (H^u-H^l)}{\varphi (0)}\Vert \nabla _\theta b_{k-1}(\eta _{k-1})\Vert ,\quad \forall k< K. \end{aligned}$$

By induction, we have

$$\begin{aligned} \Vert \nabla _\theta b_k(\theta )\Vert \le \left( \frac{\varphi (H^u-y_K)}{\varphi (0)}\right) ^{k-K}\left( \frac{\varphi (H^u-H^l)}{\varphi (0)}\right) ^{K}\Vert \nabla _\theta b_0(\eta _0)\Vert ,\quad \forall k\ge K. \end{aligned}$$

By Assumption 7, we have \(\Vert \nabla _\theta b_0(\theta )\Vert \le A\); hence \(d_k\) is bounded above by

$$\begin{aligned} d_k\le A\left( \frac{\varphi (H^u-y_K)}{\varphi (0)}\right) ^{k-K}\left( \frac{\varphi (H^u-H^l)}{\varphi (0)}\right) ^{K}m\delta _k, \quad \forall k\ge K. \end{aligned}$$

If \(\delta _k=\delta \alpha ^k\) and \(\alpha <\frac{\varphi (0)}{\varphi (H^u-y_K)}\), we have \(\sum _{k=1}^\infty d_k<\infty \), which implies that

$$\begin{aligned} \sum _{k=1}^{\infty }|b_k(x)-\tilde{b}_k(x)|\le \sum _{k=1}^{\infty }\int _{\Theta }|b_k(x,\theta )-\tilde{b}_k(x,\theta )|d\theta <\infty . \end{aligned}$$

Therefore, \(\sum _{k=1}^{\infty }|b_k(x)-\tilde{b}_k(x)|<\infty \) almost everywhere in \(\mathcal {X}\), which is exactly Assumption 4. \(\square \)
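The summability condition at the end of the proof translates directly into an implementable schedule for the artificial noise: choose \(\delta _k=\delta \alpha ^k\) with \(\alpha \) strictly below \(\varphi (0)/\varphi (H^u-y_K)\). Since that ratio is unknown in practice, an implementation would pick \(\alpha \) conservatively; a sketch with illustrative values:

```python
import numpy as np

def noise_schedule(delta0, alpha, n_iters):
    """Geometrically decaying noise radii delta_k = delta0 * alpha**k."""
    return delta0 * alpha ** np.arange(n_iters)

rho = 1.2                            # stand-in for phi(H^u - y_K) / phi(0)
alpha = 0.5                          # any alpha < 1/rho makes the sum of d_k finite
deltas = noise_schedule(0.5, alpha, 200)
d = rho ** np.arange(200) * deltas   # d_k up to constant factors, cf. the bound above
print(deltas[:3], d.sum())           # the series converges since rho * alpha < 1
```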


Cite this article

Chen, X., Zhou, E. Population model-based optimization. J Glob Optim 63, 125–148 (2015). https://doi.org/10.1007/s10898-015-0288-1
