Stochastics and Statistics
Error estimation properties of Gaussian process models in stochastic simulations

https://doi.org/10.1016/j.ejor.2012.12.033

Abstract

The theoretical relationship between the prediction variance of a Gaussian process model (GPM) and its mean square prediction error is well known. This relationship has been studied for the case when deterministic simulations are used in GPM, with application to design of computer experiments and metamodeling optimization. This article analyzes the error estimation of Gaussian process models when the simulated data observations contain measurement noise. In particular, this work focuses on the correlation between the GPM prediction variance and the distribution of prediction errors over multiple experimental designs, as a function of location in the input space. The results show that the error estimation properties of a Gaussian process model using stochastic simulations are preserved when the signal-to-noise ratio in the data is larger than 10, regardless of the number of training points used in the metamodel. Also, this article concludes that the distribution of prediction errors approaches a normal distribution with a variance equal to the GPM prediction variance, even in the presence of significant bias in the GPM predictions.

Highlights

► Error estimation is studied in Gaussian process models built from stochastic samples.
► The relationship between prediction error and prediction variance is quantified.
► With a signal-to-noise ratio less than 10, GPM loses accuracy in error estimation.
► The GPM prediction errors are normally distributed, despite bias in GPM prediction.

Introduction

A recurring question in metamodeling is how to assess the prediction accuracy of the model. Even though a validation step is typically performed as part of the metamodel construction, the question remains of how accurate the prediction of the model is at an untried sample point. Assume that the response of an expensive simulation $f$ can be described as $y_{tr}(x) = f(x)$, where the subscript $tr$ denotes the true mean response of the simulation at $x$. The approximate model of the expensive simulation, $\hat{f}$, predicts the mean response at $x$ as $y_{app}(x) = \hat{f}(x)$. Therefore, the mean square prediction error of the approximate model at $x$ is
$$\delta^2(x) = \left[y_{tr}(x) - y_{app}(x)\right]^2.$$
In the scenario of expensive simulations, a user may have limited resources (i.e., a limited number of simulations) for the training and validation steps of the metamodel. This situation is more difficult when the user is approximating stochastic simulations, since each observed response $y$ from the simulation is corrupted by a measurement noise $\eta$ around the mean response, $y(x) = y_{tr}(x) + \eta$. Because of this, researchers have looked for alternatives for estimating the prediction error that do not require additional evaluations of the expensive simulation.
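To make the notation concrete, the following minimal Python sketch generates noisy observations $y(x) = y_{tr}(x) + \eta$ and evaluates the squared prediction error $\delta^2(x)$ at a few test points. The one-dimensional test function and the placeholder polynomial metamodel are illustrative assumptions, not anything used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_u = 0.1                    # standard deviation of the measurement noise eta

def f(x):
    """Hypothetical true mean response y_tr(x) of the expensive simulation."""
    return np.sin(3 * x) + 0.5 * x

# Noisy observations y(x) = y_tr(x) + eta at the training points.
x_train = rng.uniform(0, 2, size=20)
y_obs = f(x_train) + rng.normal(0.0, sigma_u, size=x_train.shape)

# Placeholder approximate model y_app (a cubic polynomial fit, purely illustrative).
coeffs = np.polyfit(x_train, y_obs, deg=3)

# Squared prediction error delta^2(x) = (y_tr(x) - y_app(x))^2 at a few test points.
x_test = np.linspace(0, 2, 5)
delta_sq = (f(x_test) - np.polyval(coeffs, x_test)) ** 2
print(delta_sq)
```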

Gaussian process modeling (GPM) is one of the most popular methods for constructing approximate models, not only because of its flexibility and good prediction results, but also because it provides its own error estimate of the GPM prediction. According to GPM theory, when the GPM structure is completely known (that is, the true GPM parameter set, the true local correlation structure and the true regression functions in the model are known), the GPM prediction variance is an estimator of the mean square prediction error. In practice, the user has to make decisions about the GPM structure, incurring an additional model uncertainty factor that alters this theoretical property. Many applications of GPM implement a “plug-in” version of the Gaussian process, using an estimated parameter set. Santner et al. (2003) called this version of GPM the empirical best linear unbiased predictor. Several authors have already noted this situation, suggesting that the GPM prediction variance of the empirical GPM version underestimates the “true” GPM prediction variance (Cressie, 1993, den Hertog et al., 2006). These issues raise questions about how to estimate the error in Gaussian process models in practice.
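As a rough illustration of the “plug-in” (empirical) GPM just described, the sketch below fits a Gaussian process whose hyperparameters are estimated by maximum likelihood, using scikit-learn; the library, kernel choice and toy function are assumptions made for illustration, not the implementation used in the paper. The returned predictive standard deviation plays the role of the GPM error estimate discussed above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
f = lambda x: np.sin(3 * x) + 0.5 * x          # hypothetical true mean response
sigma_u = 0.1                                   # measurement-noise standard deviation

X = rng.uniform(0, 2, size=(25, 1))
y = f(X).ravel() + rng.normal(0.0, sigma_u, size=25)

# "Plug-in" (empirical) GPM: the length scale and noise level are estimated by
# maximum likelihood rather than assumed known.
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=sigma_u**2)
gpm = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = np.linspace(0, 2, 50).reshape(-1, 1)
mean, std = gpm.predict(X_new, return_std=True)
# `std` is the estimated predictive standard deviation; because WhiteKernel is part
# of the kernel, it includes the estimated noise level (i.e. it describes a new
# noisy observation rather than the latent mean alone).
```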

Error estimation measures are useful for the assessment of approximate models. In many applications, the success of approximate models depends on the accuracy of the error estimation. Error estimators are used to quantify the level of uncertainty or “trust” in the prediction of an approximate model, thereby indicating particular regions of the input space where additional samples are required. Some examples include improvement of design of experiments in the creation of approximate models (Hernandez and Grover, 2010) and optimization of time-consuming computer simulations via black-box models (Kleijnen et al., 2010).

Early findings on error estimation in GPM were made by Meckesheimer et al. (2002). They evaluated leave-k-out cross-validation strategies as a procedure to assess the accuracy of low-order polynomial functions, radial basis functions and kriging models over the design space. Goel et al. (2009) presented a detailed study on error estimation, evaluating response surfaces and kriging models for six classical benchmark examples in the statistics field. As a result of this last study, Goel et al. concluded that local evaluations of the GPM prediction variance can be used for global error estimation of Gaussian process models. As relevant as these results are for error estimation in GPM, the studies were limited to predictions of deterministic simulations.

The presence of noise in the observations incorporates an additional element into the GPM prediction that has been discussed previously in the literature. Kleijnen and coworkers have worked extensively on the use of kriging models for random simulations (Kleijnen et al., 2010, Kleijnen and van Beers, 2005, van Beers and Kleijnen, 2003, van Beers and Kleijnen, 2008), using replicates at each sample point to calculate sample means and then treating those values as deterministic outcomes in the GPM construction. In the area of error estimation, the same group implemented a parametric bootstrapping approach to calculate the mean square prediction error of the GPM (den Hertog et al., 2006), but it was not used as an error estimator in the approximate model and it was only employed with deterministic simulations. A similar implementation of this bootstrapping approach was also used to evaluate the uncertainty of time-course experimental data in cell signaling pathways and network topology of time-series gene expression data (Kirk and Stumpf, 2009). Ankenman et al. (2010) extended Kleijnen’s GPM for random simulations with their stochastic kriging model, which captures the intrinsic uncertainty, or noise, in the simulations with an additional variance parameter for each sample point. In contrast to these papers, where the major interest was the GPM mean prediction, the work presented here focuses on the GPM prediction variance and its role as an error estimator of the approximate model when stochastic observations are used in GPM.
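A hedged sketch of the replicate-averaging strategy attributed above to Kleijnen and coworkers might look as follows, again using scikit-learn and a toy function as stand-ins: each design point is simulated several times, the sample means are treated as the observations, and the variance of the sample mean, $\sigma_u^2/n_{rep}$, is supplied as a nugget term. This is only loosely analogous to stochastic kriging, which estimates a separate intrinsic variance at each design point.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)
f = lambda x: np.sin(3 * x) + 0.5 * x      # hypothetical true mean response
sigma_u, n_rep = 0.2, 10                   # noise std and replicates per design point

X = np.linspace(0, 2, 12).reshape(-1, 1)
# Simulate n_rep noisy replicates at each design point and average them,
# treating the sample means as (nearly deterministic) observations.
reps = f(X) + rng.normal(0.0, sigma_u, size=(12, n_rep))
y_bar = reps.mean(axis=1)

# The variance of the sample mean is sigma_u^2 / n_rep; supplying it through
# `alpha` adds that nugget to the diagonal of the training covariance.
gpm = GaussianProcessRegressor(kernel=RBF(length_scale=0.5),
                               alpha=sigma_u**2 / n_rep).fit(X, y_bar)
```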

Section snippets

Theory

Consider a set $D$ of $n$ input/output pairs $\{x_i, y(x_i)\}$, where $x_i \in \mathbb{R}^d$, $y(x_i) \in \mathbb{R}$, $i = 1, \ldots, n$. This set of input/output pairs will be referred to as sample points or experimental points. Consider also that the observed data $y(x_i)$ contains an additive measurement noise $\eta \sim N(0, \sigma_u^2)$ along with the true response of the simulation $y_{tr}(x_i) \in \mathbb{R}$. The subscript $u$ represents the uncorrelated nature of the additive noise in each of the experimental points:
$$y(x_i) = y_{tr}(x_i) + \eta.$$
Despite the presence of noise in the observed data,
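For reference, the sketch below implements a zero-mean Gaussian process predictor from first principles, with an uncorrelated noise (nugget) term $\sigma_u^2$ on the training covariance. It is only a simplified stand-in for the GPM prediction mean and variance of this section (the paper's Eq. (12) refers to the prediction variance): the regression/trend part of the full GPM and the hyperparameter estimation step are omitted, and the squared-exponential correlation is an assumed choice.

```python
import numpy as np

def rbf(A, B, length_scale=1.0, variance=1.0):
    # Squared-exponential (Gaussian) correlation, a common GPM choice.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X, y, X_new, sigma_u2, length_scale=1.0, variance=1.0):
    """Zero-mean GP posterior with an uncorrelated noise (nugget) term sigma_u2.
    Simplified stand-in for the GPM equations; no trend term, fixed hyperparameters."""
    K = rbf(X, X, length_scale, variance) + sigma_u2 * np.eye(len(X))
    k_star = rbf(X, X_new, length_scale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # (K + sigma_u2 I)^{-1} y
    mean = k_star.T @ alpha                                # GPM prediction mean
    v = np.linalg.solve(L, k_star)
    var = variance - (v**2).sum(axis=0)                    # GPM prediction variance
    return mean, var
```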

Materials and methods

This section describes the different case studies used to evaluate the error estimation properties of Gaussian process models in the presence of stochastic simulations. Error estimates are computed using the GPM prediction variance, Eq. (12). The main goal is to elucidate the role of the uncorrelated noise parameter $\sigma_u^2$ in the error estimation of GPM, as well as to provide recommendations for future GPM users.
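Since the abstract frames the results in terms of a signal-to-noise ratio of roughly 10, the helper below sketches one plausible way to quantify that ratio: the variance of the true response over the design region divided by $\sigma_u^2$. This definition, and the Monte Carlo estimate of it, are assumptions made for illustration and may differ from the definition used in the paper (e.g., it could be applied to the Camelback function and bounds used in the next section's sketch).

```python
import numpy as np

def signal_to_noise(f, bounds, sigma_u, n_mc=10_000, seed=0):
    """Hypothetical SNR: Var[y_tr(X)] over the design region divided by sigma_u^2.
    `bounds` is a sequence of (low, high) pairs, one per input dimension;
    `f` must accept an (n, d) array of input points."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(n_mc, len(lo)))
    return np.var(f(x)) / sigma_u**2
```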

Location-based error estimation of GPM

Using the first analysis procedure, described in the second paragraph of Section 3.3, it is possible to evaluate the theoretical error relationship in Eq. (14). Fig. 2a and c show a comparison between the mean square prediction error $MSE(x)$ and an average value of $\sigma_y^2(x)$ for each of the 300 test sample points of the Camelback test function. The figure summarizes the effects of changing the number of sample points in the GPM, as well as the effect of the measurement noise in the stochastic
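The location-based comparison described above can be emulated with the following hedged sketch: over several random experimental designs, a GP is fit to noisy Camelback observations, the squared prediction error and the GPM prediction variance are recorded at a fixed set of test points, and both are averaged over designs to give $MSE(x)$ and the mean $\sigma_y^2(x)$. The use of scikit-learn, random (rather than the paper's) designs, and a known noise variance are simplifying assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

def camelback(x):
    # Six-hump Camelback test function, evaluated here on [-2, 2] x [-1, 1].
    x1, x2 = x[:, 0], x[:, 1]
    return (4 - 2.1 * x1**2 + x1**4 / 3) * x1**2 + x1 * x2 + (-4 + 4 * x2**2) * x2**2

rng = np.random.default_rng(3)
sigma_u, n_train, n_test, n_designs = 0.5, 30, 300, 50
X_test = rng.uniform([-2, -1], [2, 1], size=(n_test, 2))     # fixed test locations

sq_err = np.zeros((n_designs, n_test))
pred_var = np.zeros((n_designs, n_test))
for k in range(n_designs):
    X = rng.uniform([-2, -1], [2, 1], size=(n_train, 2))     # a new random design
    y = camelback(X) + rng.normal(0.0, sigma_u, size=n_train)
    gpm = GaussianProcessRegressor(ConstantKernel() * RBF([1.0, 1.0]),
                                   alpha=sigma_u**2).fit(X, y)  # noise assumed known
    mean, std = gpm.predict(X_test, return_std=True)
    sq_err[k] = (camelback(X_test) - mean) ** 2
    pred_var[k] = std**2

mse = sq_err.mean(axis=0)        # empirical MSE(x) over the designs
avg_var = pred_var.mean(axis=0)  # average GPM prediction variance sigma_y^2(x)
```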

Discussion

Although the main subject of this work is error estimation in Gaussian process models using stochastic simulations, the results illustrate how the GPM balances its behavior between the noise and the signal in the observations. In the end, this relationship defines all the error estimation aspects of the Gaussian process model. If the GPM is capable of recognizing the underlying true function to be approximated, via its local correlation features, then the model has a clear correspondence

Conclusions

Gaussian process models have been used in many research areas as surrogate models of complex simulations. Thanks to their statistical background, these surrogate models have been used to evaluate and estimate prediction errors. This paper presents a simple interpretation of the qualities of Gaussian process models when stochastic observations are used. The presence of noise in the observations makes it more difficult to identify the underlying mean function via the local correlation

Acknowledgments

The authors gratefully acknowledge Jye-Chyi Lu and Roshan J. Vengazhiyil for their helpful discussions in this work. This research was supported by the Air Force Office of Scientific Research (Grant FA9550-07-0161) and the National Science Foundation (Grant CBET-0933430).

References (22)

  • T. Goel et al., Comparing error estimation measures for polynomial and kriging approximation of noise-free functions, Structural and Multidisciplinary Optimization (2009)