Stochastics and Statistics
Multivariate versus univariate Kriging metamodels for multi-response simulation models

https://doi.org/10.1016/j.ejor.2014.02.001Get rights and content

Highlights

  • Applying nonseparable dependence models ensures positive-definite covariance matrix.

  • We develop a Monte Carlo laboratory that satisfies all Kriging assumptions.

  • We find that simple univariate Kriging has smaller MSE, so it is more practical.

  • With known parameters, multivariate Kriging is not better than univariate Kriging.

Abstract

To analyze the input/output behavior of simulation models with multiple responses, we may apply either univariate or multivariate Kriging (Gaussian process) metamodels. In multivariate Kriging we face a major problem: the covariance matrix of all responses should remain positive-definite; we therefore use the recently proposed “nonseparable dependence” model. To evaluate the performance of univariate and multivariate Kriging, we perform several Monte Carlo experiments that simulate Gaussian processes. These Monte Carlo results suggest that the simpler univariate Kriging gives smaller mean square error.

Introduction

In operations research (OR) practice, simulation is often applied. Simulation may be either deterministic or random (stochastic). Applications of deterministic simulation abound in engineering such as computer aided engineering (CAE), but there are also applications in OR as demonstrated by the following two examples. Example 1 concerns the management of fisheries at the French Research Institute for Exploitation of the Sea (IFREMER); see Mahevas and Pelletier (2004). Example 2 is the case study on the CO2 greenhouse effect by Kleijnen, Van Ham, and Rotmans (1992). Applications of random simulation are plentiful in OR, especially in queueing and inventory management; see the references in Kleijnen (2008, pp. 3–6).

Kriging model may be used to analyze the input/output (I/O) behavior of a given simulation model; this analysis may serve validation, sensitivity analysis, and optimization, as discussed by Kleijnen (2008). This Kriging gives a metamodel; i.e., it approximates the I/O function defined by the underlying simulation model. There are different types of metamodels; most popular is a polynomial of either first or second order; see Kleijnen (2008). We, however, focus on Kriging, which has already become popular in engineering and is gaining popularity in OR; see the many references in Chen et al., 2012, Kleijnen, 2008. Most of this Kriging literature, however, ignores multivariate Kriging; also see our literature summary below.

In practice, a given simulation model has multiple outputs—also called responses or performance criteria. For example, Kleijnen (1993) discusses a case study on the production planning of steel tubes of different types, using a simulation model with 28 outputs which—after a discussion with management—were reduced to two outputs. Kleijnen and Smits (2003) discusses multiple performance metrics in supply chain management. The literature on metamodels, however, often reduces these multiple outputs to a single output—either ignoring all the other outputs or combining all outputs through a weighting function; in our Monte Carlo experiments (detailed In Section 4) we shall briefly discuss results for the sum and the product of two outputs. Other publications present metamodels per individual output ignoring the correlations between outputs; e.g., Kleijnen, Van Beers, and Van Nieuwenhuyse (2010) fit univariate Kriging models for each of the two outputs—namely, cost and service—of a call-center simulation. In all our Monte Carlo experiments we also apply such univariate Kriging—besides multivariate Kriging.

Intuitively, it may seem that multivariate Kriging gives a lower mean squared error (MSE) than univariate Kriging, because the former accounts for the cross-correlations between different output types, whereas the latter accounts only for the auto-correlations between outputs of the same type for different input combinations—as we shall explain in Sections 2 Basic univariate Kriging, 3 Multivariate Kriging. However we think this intuition may be misleading. In practice the Kriging parameters are unknown so they must be estimated, which increases the MSE; multivariate Kriging requires the estimation of additional parameters—namely, the cross-correlations—which further increases the MSE. Note that Hernandez and Grover (2013) also use the MSE criterion in their article on Kriging.

To empirically compare univariate and multivariate Kriging, we use Monte Carlo experiments that guarantee the validity of the Kriging metamodel. The literature usually experiments with realistic simulation models, but these experiments imply approximation errors (bias) of the Kriging metamodels. Moreover, these simulation models may be computationally expensive. We limit our investigation to Kriging in deterministic simulation, which is also the basis for Kriging in stochastic simulation.

Furthermore, we limit our first Monte Carlo experiments to situations with a single input and two outputs. Many OR problems do have a single input; examples are queueing simulation models with the traffic rate as the single input and inventory models such as the newsvendor problem with the order quantity as the single input. Moreover, Kriging in simulation usually assumes that in case of multiple inputs the correlation function is the product of the correlation functions per individual input; see Eq. (2). In this example we limit the number of multiple outputs to two; in case of more outputs, the cross-correlations are correlations between all pairs of outputs. We do vary the magnitudes of the cross-correlation between the two outputs. In the second example we base our Monte Carlo experiment on a climate simulation with five inputs and three outputs.

To provide some background for our study, we summarize the rather limited number of publications that explicitly discuss multiple outputs. This literature assumes different types of multivariate models; we distinguish the following three types:

  • 1.

    In practice, simulation models may have (say) n types of output; each type is a specific transformation of the same input combination and the same pseudorandom number stream; in deterministic simulation, this stream vanishes. Software (such as Arena) for building and running discrete-event simulation models permits the automatic collection of multiple outputs. Not only simulation may give multiple outputs; real-life systems may too. This type is the focus of our study.

  • 2.

    A given real system may be represented by n different simulation models with different degrees of realism (detail); so-called multi-fidelity simulation. We claim that this situation is extremely rare in OR. The simulation model with few details is run for many input combinations, whereas the detailed type is run for fewer combinations. Obviously, the most detailed simulation is the real system itself. See Santner et al., 2003, Forrester et al., 2008, Goh et al., 2013, Tuo et al., 2013, and also “partially heterotropic” situations in Wackernagel (2003, p. 158).

  • 3.

    Besides the output of interest, the modelers collect information on the gradient of this output. In discrete-event simulation, this type is rare, because the estimation of this gradient is complicated (it typically uses either “perturbation analysis” or the “score function” method). An example is Chen, Ankenman, and Nelson (2013). Obviously, the output and its gradient are estimated for the same input combinations.

For type-1 real-life systems, Cressie (1991, pp. 138–142) speaks of cokriging in his book on spatial data analysis. Wackernagel (2003, pp. 143–209) also discusses geostatistics, so he restricts the input data to one, two, or three dimensions (whereas simulation implies an arbitrary number of dimensions). Gneiting, Kleiber, and Schlather (2010) also discuss cokriging in geostatistics assuming so-called Matérn correlation functions. Santner et al. (2003, pp. 101–116) do discuss simulation or computer experiments, assume type-3 simulations. Higdon, Gattiker, Williams, and Rightley (2008) discuss the combination of real-life “field data” and simulation data, where both types of data concern the same real-life system so it concerns type-2 situations; they allow for very many types of output. Forrester (2010) also discusses type-2 situations; i.e., the combination of (i) scarce and expensive real-life data with abundant and inexpensive simulation data, or (ii) scarce and expensive data from a detailed simulation model with abundant and inexpensive data from a quick-and-dirty simulation model. Williams, Santner, Notz, and Lehman (2010) discuss multivariate Kriging in constrained optimization in simulation with multiple outputs—but they follow Santner et al. (2003). Altogether we recommend Santner et al., 2003, Wackernagel, 2003 for an introduction to multivariate Kriging. Note that Li, Azarm, Farhang-Mehr, and Diaz (2006) also recognize that in practice simulation models have multiple outputs and that Kriging is an important type of metamodel, but those authors use a completely different approach (they do not use cokriging with estimated cross-correlations).

Besides the areas of operations research (our focus), geostatistics, and engineering there is another area with major contributions to Kriging or Gaussian process (GP); namely machine learning; see Rasmussen and Williams (2005). Multivariate GPs are investigated in machine learning in multi-task learning, multi-sensor networks or structured output data. To obtain positive definite (PD) covariance matrixes, this community uses either so-called separable models or nonseparable models. The nonseparable models are based on either convolution method or the linear model of coregionalization (LMC). We define these different models in Section 3. Separable models for multi-task learning are used by Bonilla, Chai, and Williams (2007). Álvarez, Rosasco, and Lawrence (2011) show how several models in machine learning are special cases of LMC. In LMC, they use Cholesky’s decomposition of the cross-covariance matrix to construct a PD covariance matrix, and they show that the convolution method gives lower standardized mean square errors than LMC. Fricker, Oakley, and Urban (2010) present an LMC variant that uses eigendecomposition of the cross-covariance matrix to construct a PD covariance matrix. They show that their LMC variant gives a lower mean squared error than the convolution method. The convolution method is introduced to this community by Boyle and Frean (2005). The main disadvantage of this method are the computational and storage requirements. Álvarez and Lawrence (2011) propose a more efficient approximation for multivariate GPs constructed through the convolution method. This method does not spend much effort on accurate modeling of cross-covariance. To improve the accuracy in convolution method, more parameters are needed; Fricker et al. (2010) propose a new method that introduces such parameters. Note that Constantinescu and Anitescu (2013) specify the covariance matrixes imposing constraints originating from the physics laws that determine relationships among the outputs of their application; in OR, however, such knowledge is usually not available.

We summarize our article as follows. We consider multivariate Kriging model constructed through LMC which is proposed by Fricker et al. (2010). Furthermore, we interpret this novel Kriging model. We also present Monte Carlo results for the performance of this multivariate model and univariate Kriging per output. Using this Monte Carlo laboratory, we confirm previous results showing that multivariate Kriging does not provide improvements compared with univariate Kriging—even under ideal conditions. Svenson and Santner (2010) use Fricker et al. (2010)’s LMC for their multi-objective optimization problem; unlike we, they do not compare univariate and multivariate Kriging. Fricker et al. (2010) find that univariate Kriging always gives smaller RMSEs than multivariate Kriging. Fricker et al. (2010) suggest that if the output is a function of other outputs, then multivariate Kriging outperform univariate Kriging. We use the data in Fricker et al. (2010) only to select the parameters in our Monte Carlo experiment with a multivariate Kriging metamodel that has no specification errors; i.e., their Kriging metamodel is only an approximation of the true I/O function of their underlying simulation model, whereas multivariate Kriging in our Monte Carlo laboratory gives a metamodel without any bias. So, instead of selecting some arbitrary data that might accidentally favor or “bias” our methodology, we base our experiments on Fricker et al. (2010)’s data. Note that in an Appendix we give details, including several statistical tests for verifying the correctness of Monte Carlo experiments with Kriging; such tests are necessary because computer codes may contain unintended programming errors and peers should be enabled to reproduce results.

We organize the remainder of this article as follows. Section 2 summarizes the basics of univariate Kriging, including references to computational issues. Section 3 extends this Kriging to multivariate Kriging with nonseparable dependence structure, including technical details. Section 4 describes our Monte Carlo laboratory with GP models so the Kriging assumptions are guaranteed and we can use this laboratory to empirically compare univariate and multivariate Kriging. Section 5 presents conclusions and topics for future research. The references at the end of this article enable the reader to study more aspects of this challenging topic.

Section snippets

Basic univariate Kriging

The various disciplines that apply Kriging, use different terminologies. We have already observed that geostatisticians speak of “sites”, whereas simulationists speak of “points” or “combinations”. In machine learning, the “old” points are called the “training set”. Simulationists use correlation functions, whereas geostatisticians use the related concept of variograms.

Our notation remains close to the notation in DACE—the free univariate MATLAB Kriging toolbox developed and well-documented by

Multivariate Kriging

In this section we consider n1 outputs for each of the m input combinations (type-1 model in Section 1); i.e., the simulation outputs become yi;g (i=1,,m) (g=1,,n). For the reader’s convenience, we collect symbols in the Appendices A and B including Tables A.1 and B.1. In multivariate Kriging, we can still use the multinormal density defined for the univariate case in (4)—provided we define the stacked vector (say) Y with mn elements such that we first gather the n outputs at the first input

Monte Carlo laboratory: Sampling from a GP

First we explain why we need a laboratory instead of real applications. Kriging is based on specific assumptions; e.g., Kriging assumes a GP. To analyze the performance of the resulting Kriging procedure, we should start with situations that satisfy these assumptions; a “laboratory” can fully satisfy all our assumptions. Real applications enable us to study the “robustness” of the Kriging method; i.e., how well does the method perform if not all its assumptions are completely satisfied?

Conclusions and future research

In this article we compare univariate and multivariate Kriging metamodels for simulation models with multiple outputs. A major problem of multivariate Kriging is ensuring that the (symmetric) covariance matrix of all the observed simulation outputs remains positive-definite; to solve this problem, we apply a nonseparable dependence model that was originally proposed by Fricker et al. (2010). We compare the resulting multivariate Kriging with univariate Kriging per type of simulation output;

Acknowledgments

Joshua Svenson and Tom Santner (Ohio State) shared their Kriging code with us. Nathan Urban shared his data on the climate model that we used in our second Monte Carlo experiments. David Ginsbourger referred us to Álvarez et al. (2011). Two anonymous reviewers gave detailed comments that helped us to improve the original version of our paper.

References (44)

  • X. Chen et al.

    Enhancing stochastic Kriging metamodels with gradient estimators

    Operations Research

    (2013)
  • E.M. Constantinescu et al.

    Physics-based covariance models for Gaussian processes with multiple outputs

    International Journal for Uncertainty Quantification

    (2013)
  • N. Cressie

    Statistics for spatial data

    (1991)
  • D. Den Hertog et al.

    The correct Kriging variance estimated by bootstrapping

    Journal of The Operational Research Society

    (2006)
  • A.I.J. Forrester

    Black-box calibration for complex-system simulation

    Philosophical Transactions of The Royal Society A: Mathematical, Physical and Engineering Sciences

    (2010)
  • A.I.J. Forrester et al.

    Engineering design via surrogate modelling – A practical guide

    (2008)
  • Frazier, P. (2010). Wiley encyclopedia of operations research and management science. Chapter learning with dynamic...
  • Fricker, T., Oakley, J. E., & Urban, N.M. (2010). Multivariate emulators with nonseparable covariance structures....
  • T.E. Fricker et al.

    Multivariate Gaussian process emulators with nonseparable covariance structures

    Technometrics

    (2013)
  • S.E. Gano et al.

    Update strategies for Kriging models used in variable fidelity optimization

    Structural and Multidisciplinary Optimization

    (2006)
  • T. Gneiting et al.

    Matern cross-covariance functions for multivariate random fields

    Journal of the American Statististics Association

    (2010)
  • J. Goh et al.

    Prediction and computer model calibration using outputs from multi-fidelity simulators

    Technometrics

    (2013)
  • Cited by (60)

    • Passive safety systems analysis: A novel approach for inverse uncertainty quantification based on Stacked Sparse Autoencoders and Kriging metamodeling

      2022, Progress in Nuclear Energy
      Citation Excerpt :

      The comparison with the PCA-based approach (applied to filtered data) shows that i) PCA allows reducing epistemic uncertainty more than the SSAE-based approach since the former provides sharper posterior PDFs (i.e., characterized by minor variance) and ii) the MCMC sampling is computationally more expensive for SSAE than for PCA. Future research lies on the possibility of exploring: 1) more powerful tuning approaches to optimize the SSAE architecture (e.g., extensive grid search and evolutionary optimization); 2) new approaches of uncertainty propagation through DNNs (e.g., the Monte Carlo sampling, the entire-DNN unscented transform and the piecewise exponential approximation of the transfer function) taking, also, into account their computational cost; 3) new approaches, such as multivariate Kriging metamodels (Kleijnen and Mehdad, 2014), to take into account dependencies among the output components. In fact, another limitation that should be further investigated is the effect of building a distinct independent metamodel for each feature extracted, i.e., assuming the features to be independent.

    • A global surrogate model for high-dimensional structural systems based on partial least squares and Kriging

      2022, Mechanical Systems and Signal Processing
      Citation Excerpt :

      This section uses several examples to verify the capabilities of the PLS-K model for UP of the structural systems with both high-dimensional input and output. The multivariate Kriging model [25] and the PLS-PCE model [23] proposed in the existing literature are used as comparison methods, where the PLS-PCE model developed for systems with scalar output is extended to systems with high-dimensional output in this section. The modeling process of the PLS-PCE model is similar to that of the PLS-K model, which determines the input–output PCs by PLS, and updates the input weights by the EBWU criterion.

    View all citing articles on Scopus
    View full text