Abstract
Stochastic search methods for global optimization and multi-objective optimization are widely used in practice, especially on problems with black-box objective and constraint functions. Although there are many theoretical results on the convergence of stochastic search methods, relatively few deal with black-box constraints and multiple black-box objectives, and previous convergence analyses require feasible iterates. Moreover, some of the convergence conditions are difficult to verify for practical stochastic algorithms, and some of the theoretical results apply only to specific algorithms. First, this article presents technical conditions that guarantee the convergence of a general class of adaptive stochastic algorithms for constrained black-box global optimization that do not require the iterates to always be feasible, and applies them to practical algorithms, including an evolutionary algorithm. The conditions are only required for a subsequence of the iterations and provide a recipe for making any algorithm converge to the global minimum in a probabilistic sense. Second, it uses the results for constrained optimization to derive convergence results for stochastic search methods for constrained multi-objective optimization.
References
Baba, N.: Convergence of a random optimization method for constrained optimization problems. J. Optim. Theory Appl. 33(4), 451–461 (1981)
Price, W.L.: Global optimization by controlled random search. J. Optim. Theory Appl. 40(3), 333–348 (1983)
Fonseca, C.M., Fleming, P.J.: Genetic algorithms for multiobjective optimization: formulation, discussion and generalization. In: Forrest, S. (ed.) Proceedings of the Fifth International Conference on Genetic Algorithms, pp. 416–423. Morgan Kaufmann, San Mateo, CA (1993)
Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multi-objective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2), 182–197 (2002)
Solis, F.J., Wets, R.J.B.: Minimization by random search techniques. Math. Oper. Res. 6(1), 19–30 (1981)
Pinter, J.D.: Global Optimization in Action. Kluwer Academic Publishers, Dordrecht (1996)
Stephens, C.P., Baritompa, W.: Global optimization requires global information. J. Optim. Theory Appl. 96(3), 575–588 (1998)
Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, New Jersey (2003)
Zabinsky, Z.B.: Stochastic Adaptive Search in Global Optimization. Springer US, New York (2003)
Birbil, S., Fang, S.C., Sheu, R.L.: On the convergence of a population-based global optimization algorithm. J. Global Optim. 30(2–3), 301–318 (2004)
Pinter, J.D.: Convergence properties of stochastic optimization procedures. Optim. J. Math. Program. Oper. Res. 15(3), 405–427 (1984)
Baba, N., Takeda, H., Miyake, T.: Interactive multi-objective programming technique using random optimization method. Int. J. Syst. Sci. 19(1), 151–159 (1988)
Hanne, T.: On the convergence of multiobjective evolutionary algorithms. Eur. J. Oper. Res. 117, 553–564 (1999)
Rudolph, G., Agapie, A.: Convergence properties of some multi-objective evolutionary algorithms. In: Proceedings of the 2000 Congress on Evolutionary Computation (CEC 2000), vol. 2, pp. 1010–1016. IEEE, La Jolla, CA (2000)
Laumanns, M., Thiele, L., Deb, K., Zitzler, E.: Combining convergence and diversity in evolutionary multiobjective optimization. Evol. Comput. 10(3), 263–282 (2002)
Schütze, O., Laumanns, M., Coello, C.A.C., Dellnitz, M., Talbi, E.: Convergence of stochastic search algorithms to finite size Pareto set approximations. J. Global Optim. 41(4), 559–577 (2008)
Brockhoff, D.: Theoretical aspects of evolutionary multiobjective optimization. In: Auger, A., Doerr, B. (eds.) Theory of Randomized Search Heuristics: Foundations and Recent Developments, pp. 101–139. World Scientific Publishing Co., Inc., River Edge, NJ (2011)
Regis, R.G.: Convergence guarantees for generalized adaptive stochastic search methods for continuous global optimization. Eur. J. Oper. Res. 207(3), 1187–1202 (2010)
Resnick, S.I.: A Probability Path. Birkhäuser, Boston (1999)
Hansen, N.: The CMA evolution strategy: a comparing review. In: Lozano, J.A., Larrañaga, P., Inza, I., Bengoetxea, E. (eds.) Towards a New Evolutionary Computation: Advances in Estimation of Distribution Algorithms, pp. 75–102. Springer, Berlin, Heidelberg (2006)
Fang, K.T., Zhang, Y.T.: Generalized Multivariate Analysis. Science Press, Springer, Beijing (1990)
Regis, R.G.: Evolutionary programming for high-dimensional constrained expensive black-box optimization using radial basis functions. IEEE Trans. Evol. Comput. 18(3), 326–347 (2014)
Bäck, T., Rudolph, G., Schwefel, H.-P.: Evolutionary programming and evolution strategies: similarities and differences. In: Fogel, D.B., Atmar, J.W. (eds.) Proceedings of the Second Annual Conference on Evolutionary Programming, pp. 11–22. Evolutionary Programming Society, La Jolla, CA (1993)
Miettinen, K.: Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, Boston, MA (1999)
Geoffrion, A.M.: Proper efficiency and the theory of vector maximization. J. Math. Anal. Appl. 22(3), 618–630 (1968)
Soland, R.M.: Multicriteria optimization: a general characterization of efficient solutions. Decis. Sci. 10(1), 26–38 (1979)
Yu, P.L.: A class of solutions for group decision problems. Manag. Sci. 19(8), 936–946 (1973)
Zeleny, M.: Compromise programming. In: Cochrane, J.L., Zeleny, M. (eds.) Multiple Criteria Decision Making, pp. 262–301. University of South Carolina Press, Columbia, SC (1973)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd edn. Wiley, New York (2006)
Acknowledgments
The author thanks the anonymous reviewers for their comments. He also thanks Saint Joseph’s University for awarding him a Michael J. Morris Grant for this research.
Appendix
This section provides complete proofs of some of the results mentioned earlier.
Proof of Proposition 2.1
Fix \(\epsilon >0\) and define \({\mathcal {S}}_{\epsilon } := \{x \in {\mathcal {D}} : f(x)<f^*+\epsilon \}\). By assumption,
Now for each \(k \ge 1\), we have
By conditioning on the random elements in \(\mathcal {E}_{(n_i)-1}\), it is easy to check that for each \(\epsilon >0\), we have
Thus,
Observe that if i is the smallest index such that \(X_i \in {\mathcal {S}}_{\epsilon }\), it follows that \(X_i^*=X_i\) and \(X_n^* \in {\mathcal {S}}_{\epsilon }\) for all \(n \ge i\). Consequently, if \(X_{n_k}^* \not \in {\mathcal {S}}_{\epsilon }\), then \(X_{n_1} \not \in {\mathcal {S}}_{\epsilon }, X_{n_2} \not \in {\mathcal {S}}_{\epsilon }, \ldots , X_{n_k} \not \in {\mathcal {S}}_{\epsilon }\). Hence, for each \(k \ge 1\),
and so, \(\displaystyle {\lim _{k \rightarrow \infty }P[f(X_{n_k}^*)-f^* \ge \epsilon ]=0}\), i.e., \(f(X_{n_k}^*) \longrightarrow f^*\) in probability. By a standard result in probability theory (e.g., see [19], Theorem 6.3.1(b)), \(f(X_{n_{k(i)}}^*) \longrightarrow f^*\) a.s. as \(i \rightarrow \infty \) for some subsequence \(\{n_{k(i)}\}_{i \ge 1}\). Hence, \(\exists \mathcal {H}\subseteq \Omega \) such that \(P(\mathcal {H})=0\) and \(\displaystyle {\lim _{i \rightarrow \infty } f(X_{n_{k(i)}}^*(\omega ))=f^*}\) for all \(\omega \in \Omega {\setminus } \mathcal {H}\).
Next, define \(\mathcal {I}:= \{ \omega \in \Omega \ :\ X_{n_k}^*(\omega ) \not \in {\mathcal {D}}\ \text{ for } \text{ all } k\} = \bigcap _{k=1}^{\infty } [X_{n_k}^* \not \in {\mathcal {D}}]\).
We wish to show that \(P(\mathcal {I})=0\). Since \(\{[X_{n_k}^* \not \in {\mathcal {D}}]\}_{k \ge 1}\) is a decreasing sequence of events in the \(\sigma \)-field \(\mathcal {B}\), it follows that \(P(\mathcal {I}) = \lim _{k \rightarrow \infty } P(X_{n_k}^* \not \in {\mathcal {D}})\).
Now, for all \(k \ge 1\),
Moreover, for each \(i=1,\ldots ,k\),
Hence, for all \(k \ge 1\), \(P(\mathcal {I}) \le P(X_{n_k}^* \not \in {\mathcal {D}}) \le (1-L(\epsilon ))^k\), and so, \(P(\mathcal {I}) \le \lim _{k \rightarrow \infty } (1-L(\epsilon ))^k = 0\). This shows that \(P(\mathcal {I})=0\).
Clearly, \(P(\mathcal {I}\cup \mathcal {H}) \le P(\mathcal {I})+P(\mathcal {H})=0\), and so, \(P(\mathcal {I}\cup \mathcal {H})=0\). Next, fix \(\omega \in \Omega {\setminus } (\mathcal {I}\cup \mathcal {H})\). Since \(\omega \in \Omega {\setminus } \mathcal {I}\), \(\exists k(\omega )\) such that \(X_{n_{k(\omega )}}^*(\omega ) \in {\mathcal {D}}\), and so, \(X_n^*(\omega ) \in {\mathcal {D}}\) for all \(n \ge n_{k(\omega )}\). Hence, \(\{f(X_n^*(\omega ))\}_{n \ge n_{k(\omega )}}\) is monotonically non-increasing. Moreover, \(f(X_n^*(\omega )) \ge f^*\) for all \(n \ge n_{k(\omega )}\). Hence, \(\lim _{n \rightarrow \infty } f(X_n^*(\omega ))\) exists. Also, since \(\omega \in \Omega {\setminus } \mathcal {H}\), it follows that \(\lim _{i \rightarrow \infty } f(X_{n_{k(i)}}^*(\omega ))=f^*\). Hence, \(\lim _{n \rightarrow \infty } f(X_n^*(\omega ))=f^*\). This shows that \(f(X_n^*) \longrightarrow f^*\) a.s. \(\square \)
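As an illustration (not part of the original analysis), the mechanism behind Proposition 2.1 can be sketched numerically. The hypothetical one-dimensional example below tracks the best feasible point of an adaptive search whose iterates may leave the feasible region; a subsequence of uniform samples over the enclosing box plays the role of the global sampling condition.

```python
import random

# Hypothetical 1-D illustration (not an algorithm from the paper): minimize
# f(x) = (x - 0.7)**2 over D = [0.5, 1.0] inside the box [0, 1]. Iterates
# may be infeasible; only the best feasible point X_n^* is tracked, and a
# subsequence samples uniformly over the box, so the conditional sampling
# density is bounded below on D.
random.seed(0)
f = lambda x: (x - 0.7) ** 2
feasible = lambda x: 0.5 <= x <= 1.0

best = None          # best feasible point found so far (X_n^*)
center = 0.0         # adaptive search center; starts infeasible
for n in range(1, 20001):
    if n % 10 == 0:
        y = random.uniform(0.0, 1.0)          # global (uniform) subsequence
    else:
        y = center + random.gauss(0.0, 0.05)  # adaptive local step
    if feasible(y) and (best is None or f(y) < f(best)):
        best = y
    center = best if best is not None else y

assert best is not None and feasible(best)
assert abs(best - 0.7) < 1e-2                 # near the global minimizer
```

The local steps are arbitrary; only the uniform subsequence is needed for the convergence guarantee, mirroring the fact that the conditions of the proposition are required only on a subsequence of the iterations.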
Proof of Proposition 2.2
Fix \(\epsilon >0\) and let \(\widetilde{f} := \inf \{ f(x) : x \in {\mathcal {D}}, \Vert x-x^*\Vert \ge \epsilon \}\). Since \(f(X_n^*) \longrightarrow f(x^*)\) a.s., it follows that \(\exists \mathcal {N}\subseteq \Omega \) with \(P(\mathcal {N})=0\) such that \(f(X_n^*(\omega )) \longrightarrow f(x^*)\) for all \(\omega \in \Omega {\setminus } {\mathcal {N}}\). As in the proof of Proposition 2.1, define \(\mathcal {I}:= \{ \omega \in \Omega \ :\ X_{n_k}^*(\omega ) \not \in {\mathcal {D}}\ \text{ for } \text{ all } k\} = \bigcap _{k=1}^{\infty } [X_{n_k}^* \not \in {\mathcal {D}}]\). It was shown that \(P(\mathcal {I})=0\).
Note that \(P(\mathcal {I}\cup \mathcal {N}) = 0\). Fix \(\omega \in \Omega {\setminus } (\mathcal {I}\cup \mathcal {N})\). Since \(\omega \in \Omega {\setminus } \mathcal {N}\), we have \(f(X_n^*(\omega )) \longrightarrow f(x^*)\). By assumption, \(\widetilde{f} - f(x^*) > 0\). Hence, there is an integer \(N(\omega )\) such that for all \(n \ge N(\omega )\), we have \(f(X_n^*(\omega )) - f(x^*) < \widetilde{f} - f(x^*)\),
or equivalently, \(f(X_n^*(\omega )) < \widetilde{f}\). Moreover, since \(\omega \in \Omega {\setminus } \mathcal {I}\), there is an integer \(k(\omega )\) such that \(X_{n_{k(\omega )}}^*(\omega ) \in \mathcal {D}\). This implies that \(X_n^*(\omega ) \in \mathcal {D}\) for all \(n \ge n_{k(\omega )}\).
Now, for any \(n \ge \max (N(\omega ), n_{k(\omega )})\), we have \(X_n^*(\omega ) \in \mathcal {D}\) and \(f(X_n^*(\omega ))<\widetilde{f}\). Note that we must have \(\Vert X_n^*(\omega )-x^*\Vert <\epsilon \). (Otherwise, if \(\Vert X_n^*(\omega ) - x^*\Vert \ge \epsilon \), then \(f(X_n^*(\omega )) \ge \inf \{ f(x) : x \in {\mathcal {D}}, \Vert x-x^*\Vert \ge \epsilon \} = \widetilde{f}\), which is a contradiction.) This shows that \(X_n^*(\omega )~\longrightarrow ~x^*\) for each \(\omega \in \Omega {\setminus } (\mathcal {I}\cup \mathcal {N})\). Thus, \(X_n^* \longrightarrow x^*\) a.s. \(\square \)
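The separation argument above can be checked numerically. In the hypothetical one-dimensional example below, \(\widetilde{f}\) is estimated on a grid; any sequence whose objective values converge to \(f(x^*)\) eventually falls below \(\widetilde{f}\) and is therefore trapped in the \(\epsilon \)-ball around \(x^*\).

```python
# Hypothetical 1-D check of the argument in Proposition 2.2: f(x) = |x - 0.3|
# on D = [0, 1] has unique global minimizer x* = 0.3. For eps > 0, the level
# f_tilde = inf{f(x) : x in D, |x - x*| >= eps} is strictly above f(x*), so
# any point with f(x) < f_tilde must satisfy |x - x*| < eps.
f = lambda x: abs(x - 0.3)
x_star, eps = 0.3, 0.05

# estimate f_tilde on a fine grid of {x in D : |x - x*| >= eps}
grid = [i / 10000.0 for i in range(10001)]
f_tilde = min(f(x) for x in grid if abs(x - x_star) >= eps)
assert f_tilde > f(x_star)

# a sequence with f(x_n) -> f(x*) = 0; its tail below the level f_tilde
# necessarily lies in the eps-ball around x*
x_n = [0.3 + 0.37 / (n + 1) for n in range(200)]
tail = [x for x in x_n if f(x) < f_tilde]
assert tail and all(abs(x - x_star) < eps for x in tail)
```

Here \(\widetilde{f} \approx \epsilon \) because \(f\) grows linearly away from \(x^*\); the argument itself only needs \(\widetilde{f} > f(x^*)\).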
Proof of Proposition 2.4
Since \(S \ne \emptyset \), the given assumption implies that \(\text{ int }(S) \ne \emptyset \) (otherwise \(\text{ cl }(S)=\emptyset \)). Next, if \(\text{ bd }(S)=\emptyset \), then the above statement is vacuously true so assume that \(\text{ bd }(S) \ne \emptyset \) and let \(x \in \text{ bd }(S)\). Then, \(x \in \text{ cl }(S)\). Since \(\text{ cl }(\text{ int }(S))=\text{ cl }(S)\), it follows that \(x \in \text{ cl }(\text{ int }(S))\). Since \(x \not \in \text{ int }(S)\), it follows that x is a limit point of \(\text{ int }(S)\). Thus, every neighborhood of x contains an interior point of S.
The second part of the proposition follows from a result in [29] (Corollary 3 p. 48), which states that if C is a convex set in \(\mathbb {R}^d\) with a nonempty interior, then \(\text{ cl }(\text{ int }(C))=\text{ cl }(C)\). \(\square \)
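As a quick sanity check (a hypothetical example, not from the paper), the conclusion of Proposition 2.4 can be verified for the closed unit disk, a convex set with nonempty interior:

```python
import math

# Sanity check of Proposition 2.4 for the closed unit disk in R^2 (a convex
# set with nonempty interior): every neighborhood of a boundary point
# contains an interior point of the set.
boundary_pt = (1.0, 0.0)                      # on bd(S), since ||x|| = 1

def is_interior(p):
    return math.hypot(p[0], p[1]) < 1.0       # int(S) = {x : ||x|| < 1}

for n in range(1, 7):
    radius = 10.0 ** (-n)                     # shrinking neighborhood radius
    # pull the boundary point slightly toward the center of the disk
    candidate = (boundary_pt[0] * (1.0 - radius / 2.0), boundary_pt[1])
    dist = abs(candidate[0] - boundary_pt[0])
    assert is_interior(candidate) and dist < radius
```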
Proof of Corollary 2.1
For each \(k \ge 1\), the conditional distribution of \(Y_{n_k}\) given \(\sigma (\mathcal {E}_{(n_k)-1})\) is uniform over the box \([\ell ,u]\): \(h_{n_k}(y \ |\ \sigma (\mathcal {E}_{(n_k)-1})) = 1/\mu ([\ell ,u]),\ y \in [\ell ,u]\). Here, \(\mu ([\ell ,u])=\prod _{i=1}^d (u^{(i)}-\ell ^{(i)})\). Hence, for any \(y \in [\ell ,u]\), \(h(y) \ge 1/\mu ([\ell ,u]) > 0\),
and so, \(\mu (\{y \in [\ell ,u]\ :\ h(y)=0\})=\mu (\emptyset )=0\). By Proposition 2.5, \(f(X_n^*) \longrightarrow f^*\) a.s. \(\square \)
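For concreteness, the constant uniform density appearing in this proof is trivial to compute for a hypothetical box:

```python
# The conditional density in Corollary 2.1 is the constant 1/mu([l, u]) on
# the box, hence strictly positive everywhere on it. A hypothetical box:
l = [0.0, -1.0, 2.0]
u = [1.0, 1.0, 5.0]
volume = 1.0
for li, ui in zip(l, u):
    volume *= ui - li                 # mu([l, u]) = prod (u_i - l_i)
density = 1.0 / volume                # h(y) = 1/mu([l, u]) for y in [l, u]
assert volume == 6.0 and density > 0.0
```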
Proof of Proposition 2.6
For each \(k \ge 1\), the conditional distribution of \(Y_{n_k}\) given \(\sigma (\mathcal {E}_{(n_k)-1})\) is an elliptical distribution with conditional density
where \(u_k \in \mathbb {R}^d\) is a realization of the random vector \(U_k\). By the same argument as in the proof of Theorem 6 in [18], it can be shown that for any \(y \in [\ell ,u]\),
where \(\text{ diam }([\ell ,u]):=\Vert u-\ell \Vert \) is the largest distance between any two points in \([\ell ,u]\). Hence, \(\mu (\{y \in [\ell ,u]\ :\ h(y)=0\})=\mu (\emptyset )=0\). By Proposition 2.5, \(f(X_n^*) \longrightarrow f^*\) a.s. \(\square \)
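The density lower bound used here can be verified numerically for a diagonal Gaussian, a special case of an elliptical distribution. The constants below (box, variance bounds) are hypothetical, and the stated lower bound follows from \(\Vert y - m\Vert \le \text{diam}([\ell ,u])\) together with the variance bounds.

```python
import math

# Numerical check (hypothetical constants) of a Gaussian special case of the
# bound behind Proposition 2.6: if the mutation mean stays in the box [l, u]
# and the common variance s2 lies in [s2_min, s2_max], the density on the box
# is at least gamma * s2_max**(-d/2) * exp(-diam**2 / (2 * s2_min)), where
# diam = ||u - l|| and gamma = (2 * pi)**(-d/2).
l, u = [0.0, 0.0], [2.0, 3.0]
s2_min, s2_max = 0.5, 1.5
d = 2
diam = math.hypot(u[0] - l[0], u[1] - l[1])
gamma = (2 * math.pi) ** (-d / 2)
lower = gamma * s2_max ** (-d / 2) * math.exp(-diam ** 2 / (2 * s2_min))

def density(y, mean, s2):
    # density of N(mean, s2 * I) at y, as a product of 1-D normal densities
    val = 1.0
    for yi, mi in zip(y, mean):
        val *= math.exp(-(yi - mi) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)
    return val

# check the bound over a grid of points and means in the box, at the extreme
# admissible variances
pts = [(2.0 * a / 4, 3.0 * b / 4) for a in range(5) for b in range(5)]
for y in pts:
    for m in pts:
        for s2 in (s2_min, s2_max):
            assert density(y, m, s2) >= lower
```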
Proof of Proposition 2.7
As before, we first check that the above EP algorithm follows the GARSCO framework. For \(n=1,2,\ldots ,\mu \), we have \(k_n=1\) and \(\Lambda _{n,1} = Y_n\). Moreover, for all \(n \ge \mu +1\), we have \(k_n=2\) and
where t and i are the unique integers such that \(t \ge 1\), \(1 \le i \le \mu \) and \(n=t\mu +i\). Hence,
Since the selection of the new parent population in Step 3.4 is done in a greedy manner, it follows that for each integer \(t \ge 1\) and \(i=1,2,\ldots ,\mu \), \(X(P_i(t-1))\) is a deterministic function of \(Y_1,Y_2,\ldots ,Y_{t\mu }\). This also implies that for each integer \(t \ge 1\) and \(i=1,2,\ldots ,\mu \), \(X(P_i(t-1))\) is also a deterministic function of \(Y_1,Y_2,\ldots ,Y_{t\mu +i-1}\). Hence, for each integer \(t \ge 1\) and \(i=1,2,\ldots ,\mu \), we have \(Y_{t\mu +i} = \Phi _{(t-1)\mu +i}(\mathcal {E}_{t\mu +i-1}) + Z_{t\mu +i}\), for some deterministic function \(\Phi _{(t-1)\mu +i}\). Note that this implies that \(Y_{\mu +k} = \Phi _k(\mathcal {E}_{\mu +k-1}) + Z_{\mu +k}\) for each integer \(k \ge 1\),
where \(Z_{\mu +k}\) is a random vector whose conditional distribution given \(\sigma ({\mathcal {E}}_{\mu +k-1})\) is a normal distribution with mean vector 0 and diagonal covariance matrix \(\text{ diag }\left( \left( \sigma _{\mu +k}^{(1)}\right) ^2, \ldots , \left( \sigma _{\mu +k}^{(d)}\right) ^2 \right) \).
For each integer \(k \ge 1\) and \(j=1,2,\ldots ,d\), we have \(\left( \sigma _{\mu +k}^{(j)} \right) ^2 \ge \sigma _{\min }^2 > 0\). Define the subsequence \(\{n_k\}_{k \ge 1}\) by \(n_k:=\mu +k\) for all \(k \ge 1\). Then, we have \(Y_{n_k} = \Phi _k(\mathcal {E}_{(n_k)-1}) + W_k\), for all \(k \ge 1\), where \(W_k=Z_{n_k}\). Let \(\lambda _k\) be the smallest eigenvalue of \(\text{ Cov }(W_k)\). Since the eigenvalues of \(\text{ Cov }(W_k)\) are \(\left( \sigma _{\mu +k}^{(1)}\right) ^2, \ldots , \left( \sigma _{\mu +k}^{(d)}\right) ^2\), we have \(\displaystyle {\lambda _k = \min _{1 \le j \le d} \left( \sigma _{\mu +k}^{(j)} \right) ^2 \ge \sigma _{\min }^2}\), and so, \(\inf _{k \ge 1} \lambda _k \ge \sigma _{\min }^2 > 0\). Moreover, the conditional distribution of \(W_k\) given \(\sigma ({\mathcal {E}}_{(n_k)-1})\) is an elliptical distribution with conditional density given by
where \(\Psi (y)=e^{-y/2}\) and \(\gamma = (2\pi )^{-d/2}\). Again, \(\Psi (y)=e^{-y/2}\) is monotonically nonincreasing, and so, the conclusion follows from Proposition 2.6. \(\square \)
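A minimal sketch of the kind of EP algorithm covered by Proposition 2.7 follows (hypothetical objective and parameter values, not the paper's exact implementation). The essential feature is the floor \(\sigma _{\min }\) on the self-adaptive step sizes, which keeps the smallest eigenvalue of the (diagonal) mutation covariance bounded away from zero.

```python
import math
import random

# Sketch of evolutionary programming with a step-size floor (hypothetical
# parameters, not the paper's exact implementation). The floor sigma_min
# keeps every mutation variance at least sigma_min**2, so the smallest
# eigenvalue of the diagonal mutation covariance stays bounded below.
random.seed(1)
f = lambda x: sum(v * v for v in x)        # sphere objective on [-5, 5]^2
d, mu_pop, sigma_min = 2, 5, 0.05
low, high = -5.0, 5.0
tau = 1.0 / math.sqrt(2.0 * d)             # self-adaptation learning rate

# parent population: (point, per-coordinate step sizes)
parents = [([random.uniform(low, high) for _ in range(d)], [1.0] * d)
           for _ in range(mu_pop)]

for gen in range(200):
    offspring = []
    for x, s in parents:
        # log-normal self-adaptation of step sizes, floored at sigma_min
        s_new = [max(sigma_min, si * math.exp(tau * random.gauss(0.0, 1.0)))
                 for si in s]
        y = [min(high, max(low, xi + si * random.gauss(0.0, 1.0)))
             for xi, si in zip(x, s_new)]
        offspring.append((y, s_new))
    # greedy (mu + mu) selection of the next parent population
    parents = sorted(parents + offspring, key=lambda p: f(p[0]))[:mu_pop]

best = parents[0][0]
assert all(si >= sigma_min for _, s in parents for si in s)
assert f(best) < 0.2
```

Without the floor, the self-adaptive variances could decay to zero and the density lower bound required by Proposition 2.6 would fail.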
Cite this article
Regis, R.G. On the Convergence of Adaptive Stochastic Search Methods for Constrained and Multi-objective Black-Box Optimization. J Optim Theory Appl 170, 932–959 (2016). https://doi.org/10.1007/s10957-016-0977-z
Keywords
- Constrained optimization
- Multi-objective optimization
- Random search
- Convergence
- Evolutionary programming