Stochastics and Statistics
Stochastic Nelder–Mead simplex method – A new globally convergent direct search method for simulation optimization

https://doi.org/10.1016/j.ejor.2012.02.028

Abstract

The Nelder–Mead simplex method (NM), originally developed for deterministic optimization, is an efficient direct search method that optimizes the response function merely by comparing function values. While successful in deterministic settings, the application of NM to simulation optimization suffers from two problems: (1) it lacks an effective sample size scheme for controlling noise, so the algorithm can be misled in a wrong direction by noise, and (2) it is a heuristic algorithm, so the quality of the estimated optimal solution cannot be quantified. We propose a new variant, called the Stochastic Nelder–Mead simplex method (SNM), that employs an effective sample size scheme and a specially designed global and local search framework to address these two problems. Because it makes no use of gradient information, SNM can handle problems where the response function is nonsmooth or the gradient does not exist, which makes it complementary to existing gradient-based approaches. We prove that SNM converges to the true global optima with probability one. An extensive numerical study also shows that the performance of SNM is promising and worthy of further investigation.

Highlights

► A new direct search method, SNM, is developed for simulation optimization.
► With the developed sample size scheme, SNM handles noise effectively.
► Without requiring gradient estimation, SNM can handle nonsmooth response functions.
► We prove that SNM has an attractive global convergence property.
► Numerical experiments show that the performance of SNM is promising.

Introduction

Simulation is one of the most popular planning tools in operations research and management science. Its advantage is that it can account for any detail that is important to the system under investigation. Simulation optimization is concerned with identifying optimal design parameters for a stochastic system, where the objective function is expressed as an expectation of a function of response variables associated with a simulation model. Over the past decades, simulation optimization has received considerable attention owing to a wide range of real-world applications, for example, designing production plans that minimize the expected inventory cost under stochastic customer demand and selecting financial portfolios that maximize the expected total profit under stochastic asset prices.

In this paper, we focus on simulation optimization where the decision variables are continuous (henceforth called “continuous simulation optimization”). Recent methodological developments for continuous simulation optimization have been discussed extensively in the literature, for example, Banks (1998), Fu (2002), Tekin and Sabuncuoglu (2004), and Fu (2006). Angün and Kleijnen (2010) classified the methodologies for continuous simulation optimization into two categories: white-box methods and black-box methods. White-box methods are those in which the gradient can be estimated via a single simulation run—the best known are perturbation analysis and the likelihood ratio/score function method (Fu, 2006)—while black-box methods treat the simulation model essentially as a black box; examples are simultaneous perturbation stochastic approximation (SPSA) (Spall, 2003), response surface methodology (RSM) (Myers et al., 2009), and the many metaheuristics (genetic and evolutionary algorithms, scatter search, simulated annealing, tabu search). While white-box methods are typically computationally efficient, they require substantial knowledge of the simulation model, such as the input distributions and/or some of the system dynamics, to derive an effective and efficient gradient estimator. When the simulation model is too complex for such knowledge to be available, white-box methods cannot be used and black-box methods are the only option.

Stochastic approximation (SA) (Kiefer and Wolfowitz, 1952) is perhaps the most prevalent and extensively studied black-box method of the past decades, e.g., Blum (1954), Fabian (1971), Benveniste et al. (1990), Andradóttir (1995), Kushner and Yin (1997), Wang and Spall (2008), Bhatnagar et al. (2011), and Andrieu et al. (2011). The advantages of SA are that it requires minimal knowledge of the simulation model and that it can be proved to converge under certain regularity conditions (Kushner and Yin, 1997). RSM, on the other hand, as stated in Myers et al. (2009), “is a collection of statistical and mathematical techniques useful for developing, improving, and optimizing processes”. Taking a sequential experimentation strategy, RSM approximates the response function with first-order regression models and switches to second-order regression models whenever the first-order models no longer represent the underlying response function adequately. Analytical approaches such as canonical analysis and ridge analysis are then used to locate the optimal solution. Full coverage of RSM can be found in classic texts such as Khuri and Cornell (1996) and Myers et al. (2009).
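
As a concrete point of reference (this display is our addition, not quoted from the paper), the Kiefer–Wolfowitz form of SA estimates each coordinate of the gradient by central finite differences of noisy responses and takes a diminishing step:

```latex
x_{k+1} = x_k - a_k\,\hat{g}_k(x_k), \qquad
\hat{g}_{k,i}(x_k) = \frac{G(x_k + c_k e_i,\,\omega^{+}_{k,i}) - G(x_k - c_k e_i,\,\omega^{-}_{k,i})}{2c_k},
```

where $e_i$ is the $i$-th unit vector and the gain sequences satisfy classical conditions such as $a_k, c_k \to 0$, $\sum_k a_k = \infty$, and $\sum_k a_k^2/c_k^2 < \infty$.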

While SA and RSM are extensively used and enjoy tremendous success in many applications, their reliance on first- or second-order Taylor approximations is a serious limitation when the objective function is ill-behaved, for example, nondifferentiable (Swann, 1972), as arises in many engineering problems. This motivates the interest in direct search methods, which “rely only on function values to find the location of the optimum” (Barton and Ivey, 1996). Indeed, interest in direct search methods has been sustained and recently renewed; see, for example, Hooke and Jeeves (1961), Spendley et al. (1962), Nelder and Mead (1965), Anderson and Ferris (2001), and Kolda et al. (2003). Direct search methods are advantageous in that they do not require gradient estimation and involve relatively few function evaluations at each iteration. In practice they have generally proved robust and reliable (Swann, 1972).

The Nelder–Mead simplex method (NM) (Nelder and Mead, 1965), originally developed for unconstrained optimization of deterministic functions, is one of the most popular direct search methods (Barton and Ivey, 1996). Fletcher (1987) noted that the “Nelder–Mead simplex method is the most successful of the methods which merely compare function values”. While successful in deterministic settings, directly applying NM to simulation optimization, where the response variable is observed with noise, has serious limitations. First, NM lacks an effective sample size scheme for controlling noise; consequently, when the response variable is grossly noisy, the noise can corrupt the relative ranks of the solutions and lead the algorithm in an entirely wrong direction. Second, NM is a heuristic algorithm (Spall, 2003); the quality of the estimated optimal solution cannot be quantified. Although convergent variants of NM have been developed for deterministic optimization, e.g., Price et al. (2002), to the best of our knowledge the convergence of NM in stochastic environments has not been addressed, apart from some heuristic modifications designed to handle noisy response functions, e.g., Barton and Ivey (1996) and Humphrey and Wilson (2000).
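
To make the first limitation concrete, the following small sketch (our illustration; the quadratic response and all names are hypothetical, not from the paper) estimates how often noise flips the comparison between a better and a worse solution, and how averaging n replications per point suppresses that misranking:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_response(x, n=1):
    """Toy response f(x) = x^2 observed with N(0, 1) noise, averaged over n replications."""
    return np.mean(x ** 2 + rng.normal(0.0, 1.0, size=n))

def misrank_rate(x_good=0.0, x_bad=0.5, n=1, trials=10_000):
    """Fraction of trials in which the worse point appears better than the better point."""
    flips = sum(noisy_response(x_bad, n) < noisy_response(x_good, n) for _ in range(trials))
    return flips / trials

for n in (1, 10, 100):
    print(f"n = {n:3d} replications: misranking rate ~ {misrank_rate(n=n):.3f}")
```

In this toy setting a single replication misranks the two points roughly 43% of the time, so a rank-based method such as NM is essentially guessing, while 100 replications drive the rate below 5%; delivering such error control at a reasonable simulation budget is exactly what a sample size scheme must do.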

We propose a new variant of NM, called the Stochastic Nelder–Mead simplex method (SNM), for simulation optimization. Inheriting the advantages of NM, SNM uses only function values, rather than gradient estimates, to optimize the response function; it can therefore solve problems whose objective function is nonsmooth or whose gradient does not exist, which makes it complementary to existing gradient-based approaches. Moreover, SNM is equipped with a specially designed sample size scheme, so the noise inherent in the response variable can be effectively controlled and the mistakes made when ranking a set of solutions can be minimized. A newly developed global and local search framework further prevents the algorithm from converging prematurely, a notorious weakness of NM. We prove that SNM is a globally convergent algorithm, i.e., it is guaranteed to reach the global optima with probability one. Two computational enhancement procedures, LHS and SNS, are developed so that SNM achieves good cost economy on practical problems.

The rest of this paper is organized as follows: In Section 2, we formally define the problem. In Section 3, we present the main framework of SNM and its convergence analysis. In Section 4, we discuss two procedures that can significantly enhance the computational efficiency of SNM. In Section 5, we use an extensive numerical study to demonstrate the performance of SNM and compare it with other competing algorithms. We conclude with future directions in Section 6.


Problem definition

Consider the following continuous simulation optimization problem:

  Minimize_{x ∈ X}  E[G(x, ω)],

where x is a p-dimensional vector of continuous decision variables, X is the parameter space, and ω is a random variable defined on the probability space (Ω, F, P). The stochastic response G(x, ω) takes two inputs: the design parameters x and a random sample from the distribution of ω. A basic assumption requires that the response function G(x, ω) be measurable, so that the expectation E[G(x, ω)] exists for all x ∈ X.
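
As a minimal sketch of this setup (our illustration; the quadratic response and function names are hypothetical), the simulation is a black box returning one noisy observation of G(x, ω) per call, and E[G(x, ω)] is estimated by a sample average:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_response(x):
    """One simulation run G(x, omega): a toy quadratic response plus Gaussian noise."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x - 1.0) ** 2) + rng.normal(0.0, 1.0))

def estimate_objective(x, n):
    """Sample-average estimate of E[G(x, omega)] from n independent replications."""
    return float(np.mean([simulate_response(x) for _ in range(n)]))

print(estimate_objective([0.0, 0.0], n=100))  # noisy estimate of the true value 2.0
```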

Nelder–Mead simplex method

The Nelder–Mead simplex method (NM) (Nelder and Mead, 1965) was originally developed for nonlinear, deterministic optimization. The idea of a simplex—a geometric object that is the convex hull of p + 1 points in R^p not lying in the same hyperplane—can be traced back to Spendley et al. (1962). NM is renowned for the following advantages. First, it searches for a new solution by reflecting the extreme point with the worst function value through the centroid of the remaining extreme points.
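
The reflection move can be sketched as follows (our illustration of the classical deterministic NM step with the usual reflection coefficient α = 1; the expansion, contraction, and shrink steps are omitted):

```python
import numpy as np

def reflect_worst(simplex, f, alpha=1.0):
    """One NM reflection: rank the p + 1 vertices by f, then reflect the worst
    vertex through the centroid of the remaining p vertices."""
    ordered = sorted(simplex, key=f)            # best vertex first, worst last
    worst = ordered[-1]
    centroid = np.mean(ordered[:-1], axis=0)    # centroid excluding the worst vertex
    return centroid + alpha * (centroid - worst)

# Example: three vertices in R^2 on f(x) = ||x||^2; the worst vertex (0, 1.5)
# is reflected through the centroid (0.5, 0) to give (1.0, -1.5).
simplex = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.5])]
print(reflect_worst(simplex, lambda x: float(x @ x)))
```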

Enhancement of computational efficiency

While SNM provably enjoys global convergence, this alone does not guarantee satisfactory performance on practical problems. In this section, we offer two procedures—the generation of a better initial simplex and a signal-to-noise sample size scheme—to enhance the computational performance of SNM.
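
The paper's exact constructions are not shown in this snippet; as a hedged sketch under our own assumptions, a Latin-hypercube-style initial simplex spreads the p + 1 vertices so that each coordinate is sampled once from each of p + 1 equal strata of the search box:

```python
import numpy as np

def lhs_initial_simplex(lower, upper, rng=None):
    """Sketch of an LHS-based initial simplex: p + 1 vertices in the box
    [lower, upper] of R^p, one vertex per stratum in every coordinate."""
    rng = rng or np.random.default_rng()
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    p, m = lower.size, lower.size + 1           # dimension and number of vertices
    # For each coordinate, randomly permute the m strata and jitter within each.
    strata = np.column_stack([rng.permutation(m) for _ in range(p)])
    unit = (strata + rng.random((m, p))) / m    # m points in [0, 1]^p
    return lower + unit * (upper - lower)       # scale to the search box

print(lhs_initial_simplex([0.0, 0.0], [10.0, 10.0], np.random.default_rng(1)))
```

The signal-to-noise idea, as the section title suggests, ties the number of replications at a point to how large the observed differences between solutions are relative to the estimated noise; we do not attempt to reproduce its exact rule here.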

Numerical study

In this section, we conduct a numerical study to evaluate the performance of SNM. In particular, we compare SNM with three widely used algorithms—simultaneous perturbation stochastic approximation (SPSA) (Spall, 2003), the modified Nelder–Mead (MNM) method (Barton and Ivey, 1996), and pattern search (PS)—on 96 scenarios constructed from eight test functions, three dimensionalities, two variance settings, and two types of initial solutions.

Conclusion

In this paper, we proposed a new direct search method, SNM, for continuous simulation optimization. SNM does not require gradient estimation and can therefore handle problems where the response function is nonsmooth or the gradient does not exist, which makes it complementary to existing gradient-based approaches. We prove that SNM converges to the global optima with probability one. An extensive numerical study shows that SNM can outperform the existing algorithms SPSA, MNM, and PS.

Acknowledgments

The author would like to thank the two anonymous referees for their insightful comments and suggestions, which have significantly improved this paper. This research was supported in part by the Advanced Manufacturing and Service Management Center at National Tsing Hua University (100NN2074E1/101N2074E1) and the National Science Council of Taiwan (NSC99-2221-E-007-038-MY1).

References (36)

  • L. Andrieu et al., Gradient-based simulation optimization under probability constraints, European Journal of Operational Research (2011).
  • V. Fabian, Stochastic approximation, Optimizing Methods in Statistics (1971).
  • E.J. Anderson et al., A direct search algorithm for optimization with noisy function evaluations, SIAM Journal on Optimization (2001).
  • S. Andradóttir, Stochastic approximation algorithm with varying bounds, Operations Research (1995).
  • E. Angün et al., An asymptotic test of optimality conditions in multiresponse simulation-based optimization, INFORMS Journal on Computing (2010).
  • R.R. Barton et al., Nelder–Mead simplex modifications for simulation optimization, Management Science (1996).
  • A. Benveniste et al., Adaptive Algorithms and Stochastic Approximations (1990).
  • S. Bhatnagar et al., Stochastic approximation algorithms for constrained optimization via simulation, ACM Transactions on Modeling and Computer Simulation (2011).
  • P. Billingsley, Probability and Measure (1995).
  • J.R. Blum, Approximation methods which converge with probability one, Annals of Mathematical Statistics (1954).
  • K.-H. Chang, L.J. Hong, H. Wan, Stochastic trust-region response-surface method (STRONG)-a new... (2011).
  • L. Dai, Convergence properties of ordinal comparison in the simulation of discrete event dynamic systems, Journal of Optimization Theory and Applications (1996).
  • G. Deng et al., Variable-number sample-path optimization, Mathematical Programming (2009).
  • K.-T. Fang et al., Design and Modeling for Computer Experiments (2006).
  • R. Fletcher, Practical Methods of Optimization (1987).
  • M.C. Fu, Optimization for simulation: theory vs. practice, INFORMS Journal on Computing (2002).
  • M.C. Fu, Gradient estimation.