Chapter 9: Subjective Probability and Bayesian Methodology

https://doi.org/10.1016/S0927-0507(06)13009-1

Abstract

Subjective probability and Bayesian methods provide a unified approach to handling not only randomness from stochastic sample paths, but also uncertainty about input parameters and response metamodels. The chapter surveys basic concepts, principles and techniques useful for a subjective Bayesian approach to uncertainty analysis, data collection plans that reduce input uncertainty, response surface modeling, and expected value-of-information approaches to experimental designs for selection procedures. Some differences from the classical (frequentist) approach are identified.

Introduction

If simulation is defined to be the analysis of stochastic processes through the generation of sample paths of the process, then Bayesian and subjective probability methods apply in several ways to the modeling, design and analysis of simulation experiments. By Bayesian methods, we refer here to parameter inference through repeated observations of data with Bayes' rule. Examples in simulation are input parameter inference using field data or the inference of metamodel parameters from simulation replications. The Bayesian approach entails postulating a ‘prior probability’ model that describes a modeler's initial uncertainty about parameters, a likelihood function that describes the distribution of data given that a parameter holds a specific value, and Bayes' rule, which provides a coherent method of updating beliefs about uncertainty as data become available. By subjective probability, we refer to probability assessments for all unknown quantities, including parameters that can be inferred with Bayes' rule, as well as unknown quantities for which parameters cannot be inferred from repeated sampling of data (e.g., one-shot quantities such as the total potential market size for a particular new product from a simulated manufacturing facility). By frequentist, we mean methods based on sampling statistics from repeated observations, such as maximum likelihood estimation (MLE) to fit input parameters, or ranking and selection procedures that provide worst-case probability of correct selection guarantees based on repeated applications of the procedure. The chapter describes applications of Bayesian and subjective probability methods in simulation, and identifies some ways in which the Bayesian approach differs from the frequentist approach that underlies much of simulation theory.
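To make the prior/likelihood/posterior mechanics concrete, the following is a minimal sketch in Python of a conjugate Bayesian update for one statistical input parameter, assuming a hypothetical model: exponentially distributed service times with an unknown rate, a Gamma prior, and illustrative field data. None of the numbers come from the chapter.

```python
# Minimal conjugate Bayesian update for one statistical input parameter:
# exponential service times with unknown rate lambda and a Gamma(a0, b0) prior.
# With data x_1, ..., x_n the posterior is Gamma(a0 + n, b0 + sum(x)).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical field data: 25 observed service times (the true rate is unknown to the modeler).
data = rng.exponential(scale=1.0 / 0.8, size=25)

# Prior beliefs about the service rate: Gamma with shape a0 and rate b0 (illustrative values).
a0, b0 = 2.0, 2.0

# Bayes' rule with the conjugate Gamma prior and exponential likelihood.
a_post = a0 + data.size
b_post = b0 + data.sum()
posterior = stats.gamma(a=a_post, scale=1.0 / b_post)

print("posterior mean of the rate:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```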

In the simulation community, Glynn (1986) first suggested Bayesian applications of uncertainty analysis for statistical input parameter uncertainty. In that paper, the traditional role of estimating α=h(E[Y]) is extended to account for statistical input parameter uncertainty, so α(θ)=h(E[Y|θ]) depends upon unknown parameters with distribution p(θ) that can be updated with data from the modeled system. Three questions he poses are: (i) how to estimate the distribution of α(Θ) induced by the random variable Θ, (ii) how to estimate the mean E[α(Θ)], and (iii) how to estimate credible sets, e.g., finding a, b so that the probability Pr(α(Θ)∈[a,b]) equals a pre-specified value such as 0.95. Chick (1997) provided a review of the few works to that date that applied Bayesian ideas to simulation, then suggested a broader range of application areas than uncertainty analysis, including ranking and selection, response surface modeling, and experimental design.
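A sketch of how these three questions might be answered numerically with two-level (nested) Monte Carlo follows; the posterior for Θ, the toy output model and the sample sizes are hypothetical choices made for illustration only.

```python
# Two-level sampling: draw Theta from its posterior, estimate alpha(theta) = E[Y | theta]
# by simulation at each draw, then summarize the induced distribution of alpha(Theta).
import numpy as np

rng = np.random.default_rng(1)

def simulate_output(theta, n_rep, rng):
    """Toy simulation model: Y is the total of 10 exponential service times with rate theta."""
    return rng.exponential(scale=1.0 / theta, size=(n_rep, 10)).sum(axis=1)

# Outer level: posterior draws of Theta (an illustrative Gamma posterior).
theta_draws = rng.gamma(shape=27.0, scale=1.0 / 33.0, size=500)

# Inner level: estimate alpha(theta) = E[Y | theta] at each posterior draw.
alpha_hat = np.array([simulate_output(t, n_rep=200, rng=rng).mean() for t in theta_draws])

# (i) the empirical distribution of alpha_hat estimates the distribution of alpha(Theta);
# (ii) its average estimates E[alpha(Theta)];
# (iii) its percentiles give an approximate 95% credible interval [a, b].
print("estimated E[alpha(Theta)]:", alpha_hat.mean())
print("approximate 95% credible interval:", np.percentile(alpha_hat, [2.5, 97.5]))
```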

The basic goal is to understand how uncertainty and decision variables affect system performance, so that better decisions can be made. The premise of this chapter is that representing all uncertainty with probability can aid decision-makers who face uncertainty. Stochastic uncertainty, the randomness in simulation models that occurs even if all parameters are known, is already widely modeled with probability. The subjective Bayesian approach also models input parameter and response surface uncertainty with probability distributions, a practice that has been less common in stochastic process simulation.

Probabilistic models for uncertainty are increasingly employed for at least three reasons. First, doing so allows the modeler to quantify how parameter uncertainty influences the performance of a simulated system; parameters of models of real systems are rarely known with certainty, and the Bayesian approach to uncertainty analysis overcomes some limitations of the classical approach to parameter and model selection (Chick, 2001, Barton and Schruben, 2001, Draper, 1995). Second, simulation experiments can be designed to run more efficiently (Chick and Inoue, 2001a, Santner et al., 2003). Third, Bayesian and subjective probability methods are not new, but they are increasingly practical to implement because of improved computing power and Markov chain Monte Carlo (MCMC) methods (Gilks et al., 1996).

This chapter describes the subjective Bayesian formulation for simulation. Section 1 presents the basics of subjective probability and Bayesian statistics in the context of quantifying uncertainty about one statistical input parameter. Section 2 summarizes the main ideas and techniques for addressing three challenges in implementing Bayesian inference: maximization, integration, and sampling variates from posterior distributions. Section 3 addresses input distribution selection when multiple candidate distributions exist. Section 4 presents a joint formulation for input and output modeling, and reviews applications for data collection to reduce input uncertainty in a way that reduces output uncertainty, and for response surface modeling and simulation experiments to reduce response surface uncertainty. Section 5 describes applications of Bayesian expected value of information methods for efficiently selecting the best of a finite set of simulated alternatives.

Simulation research with Bayesian methods has grown rapidly since the mid-to-late 1990s. A partial reference list is Chen and Schmeiser (1995), Chen (1996), Scott (1996), Nelson et al. (1997), Chen et al. (1999), Cheng (1999), Lee and Glynn (1999), Andradóttir and Bier (2000), Chick and Inoue (2001a, 2001b), Chick (2001), Cheng and Currie (2003), Steckley and Henderson (2003), Chick et al. (2003), Zouaoui and Wilson (2003, 2004), Ng and Chick (2004), as well as applications to insurance, finance, waterway safety, civil engineering and other areas described in the Winter Simulation Conference Proceedings. Work on deterministic simulation with potentially important implications for stochastic simulation includes O'Hagan et al. (1999), Kennedy and O'Hagan (2001), Craig et al. (2001) and Santner et al. (2003). Excellent references for subjective probability and Bayesian statistics in general, not just in simulation, include Lindley (1972), Berger (1985) and Bernardo and Smith (1994), with special mention for de Finetti (1990), Savage (1972) and de Groot (1970).

Section snippets

Main concepts

A stochastic simulation is modeled as a deterministic function of several inputs, Yr=g(θp,θe,θc;Ur), where Yr is the output of the rth replication. The vector of statistical input parameters θp=(θ1,θ2,…,θnp) describes np sources of randomness whose values can be inferred from field data. For example, θ1 may be a two-dimensional parameter for log-normally distributed service times, and θ2 may be defect probabilities inferable from factory data. Environmental parameters θe are beyond the control …
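A toy illustration of this formulation appears below; the single-server workload model, the parameter values and the use of a seed as a stand-in for the random-number stream Ur are assumptions made for the sketch, not part of the chapter.

```python
# Illustration of Y_r = g(theta_p, theta_e, theta_c; U_r): one replication as a
# deterministic function of statistical, environmental and controllable inputs
# plus the replication's random numbers.
import numpy as np

def g(theta_p, theta_e, theta_c, u_r):
    """Toy model: theta_p = service rate (statistical, inferable from field data),
    theta_e = daily demand (environmental, beyond the modeler's control),
    theta_c = number of servers (controllable decision variable),
    u_r = seed standing in for the replication's random-number stream U_r."""
    rng = np.random.default_rng(u_r)
    service_times = rng.exponential(scale=1.0 / theta_p, size=int(theta_e))
    return service_times.sum() / theta_c   # crude proxy for workload per server

# Output of replication r with fixed inputs and stream U_r = 42.
print(g(theta_p=0.8, theta_e=100, theta_c=3, u_r=42))
```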

Computational issues

Three basic computational issues for implementing a Bayesian analysis are maximization (e.g., find the MLE θˆ or the maximum a posteriori (MAP) estimator θ˜ for a posterior distribution); integration, either to find a marginal distribution (e.g., find p(θ1|xn) from p(θ1,θ2|xn)) or the constant of proportionality for a posterior distribution (e.g., find c−1=∫f(xn|θ)dπ(θ)); and simulation (e.g., sample from p(θ|xn) in order to estimate E[g(θ)|xn]). Techniques to address these issues are described in a variety of sources …
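For the sampling issue, a generic random-walk Metropolis sketch is shown below, targeting a posterior known only up to its constant of proportionality; the exponential likelihood, Gamma(2, 2) prior, data, step size and burn-in are illustrative assumptions rather than recommendations from the chapter.

```python
# Random-walk Metropolis sampling from p(theta | x^n), using only the unnormalized
# density f(x^n | theta) * pi(theta), so the constant c never needs to be computed.
import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(scale=1.0 / 0.8, size=25)       # hypothetical field data x^n

def log_unnormalized_posterior(theta):
    if theta <= 0.0:
        return -np.inf
    log_lik = data.size * np.log(theta) - theta * data.sum()   # exponential likelihood
    log_prior = np.log(theta) - 2.0 * theta                    # Gamma(2, 2) prior, up to a constant
    return log_lik + log_prior

def metropolis(n_draws, theta0=1.0, step=0.2):
    draws, theta = np.empty(n_draws), theta0
    log_p = log_unnormalized_posterior(theta)
    for i in range(n_draws):
        proposal = theta + step * rng.standard_normal()
        log_p_prop = log_unnormalized_posterior(proposal)
        if np.log(rng.uniform()) < log_p_prop - log_p:          # accept/reject step
            theta, log_p = proposal, log_p_prop
        draws[i] = theta
    return draws

draws = metropolis(5000)[1000:]                                 # discard burn-in
print("estimate of E[theta | x^n]:", draws.mean())
```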

Input distribution and model selection

Selecting an input distribution to model a sequence of random quantities X1,X2,… is often more complicated than inferring a parameter of a single parametric distribution, as described in Section 1. There is often a finite number q of candidate distributions proposed to model a given source of randomness, with continuous parameters θm=(ϑm1,…,ϑmdm), where dm is the dimension of θm, for m=1,…,q. For example, service times might be modeled by exponential, log-normal or gamma distributions (q=3).
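One rough way to compare such candidates, sketched below, is to approximate posterior model probabilities with BIC in place of each model's marginal likelihood, under equal prior probabilities on the q candidates; the data, the candidate families and the BIC approximation itself are assumptions for illustration rather than the chapter's specific method.

```python
# Approximate posterior probabilities for q = 3 candidate service-time models,
# using exp(-BIC/2) as a stand-in for each model's marginal likelihood.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.gamma(shape=2.0, scale=1.5, size=200)     # hypothetical service-time data

candidates = {
    "exponential": (stats.expon, 1),    # (family, number of free parameters d_m with loc fixed)
    "log-normal":  (stats.lognorm, 2),
    "gamma":       (stats.gamma, 2),
}

bic = {}
for name, (family, d_m) in candidates.items():
    params = family.fit(data, floc=0.0)              # MLE fit with location fixed at zero
    log_lik = family.logpdf(data, *params).sum()
    bic[name] = -2.0 * log_lik + d_m * np.log(data.size)

best = min(bic.values())
weights = np.array([np.exp(-0.5 * (b - best)) for b in bic.values()])
post_prob = weights / weights.sum()                  # equal prior model probabilities assumed
for name, p in zip(bic, post_prob):
    print(f"approximate posterior probability, {name}: {p:.3f}")
```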

Joint input–output models

Simulation studies are concerned with both stochastic uncertainty, or randomness that occurs even when all model parameters are known, and structural uncertainty, or uncertainty about model inputs when a real system is being simulated. This section describes an input–output model that quantifies the uncertainty in simulation outputs due to input uncertainty, data collection plans for reducing input uncertainty in a way that effectively reduces output uncertainty, mechanisms to select computer inputs to …
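One simple way to see how much output uncertainty is attributable to input uncertainty, sketched below with a toy model, is the decomposition Var[Y] = E[Var(Y|Θ)] + Var(E[Y|Θ]); only the second term can be reduced by collecting more input data. The posterior, output model and sample sizes are illustrative assumptions.

```python
# Split output variance into a stochastic part, E[Var(Y | Theta)], and a structural
# part, Var(E[Y | Theta]), by nested sampling over an input-parameter posterior.
import numpy as np

rng = np.random.default_rng(4)
theta_draws = rng.gamma(shape=27.0, scale=1.0 / 33.0, size=400)   # illustrative posterior on a rate

cond_means, cond_vars = [], []
for theta in theta_draws:
    y = rng.exponential(scale=1.0 / theta, size=(200, 10)).sum(axis=1)  # replications at this theta
    cond_means.append(y.mean())
    cond_vars.append(y.var(ddof=1))

stochastic = np.mean(cond_vars)           # remains even if the input parameter were known exactly
structural = np.var(cond_means, ddof=1)   # could shrink with additional field data on the input
print("stochastic component:", stochastic)
print("structural component:", structural)
```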

Ranking and selection

This section compares and contrasts frequentist and Bayesian approaches to ranking and selection. The objective of ranking and selection is to select the best of a finite number k of systems, where best is defined in terms of the expected performance of a system (e.g., Chapter 17). In the notation of Section 4, θc assumes one of a discrete set of values indexed by i∈{1,2,…,k}, and the goal is to identify the system i that maximizes wi=E[g(θci,Θp,Θe;U)]. The means are inferred from observing …
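As a small illustration of the Bayesian view of selection, the sketch below estimates each system's posterior probability of being the best from summary statistics, assuming approximately normal output and a noninformative prior; the sample means, variances and replication counts are hypothetical, and this is generic Bayesian reasoning rather than a specific procedure from the chapter.

```python
# Posterior probability that each of k = 3 simulated systems has the largest mean,
# estimated by sampling from Student-t posteriors of the unknown means.
import numpy as np

rng = np.random.default_rng(5)
sample_mean = np.array([10.2, 10.9, 10.7])   # hypothetical summary output, larger is better
sample_var = np.array([4.0, 5.5, 3.8])
n = np.array([20, 20, 20])                   # replications per system

# With a noninformative prior, the unknown mean of system i has (approximately) a
# Student-t posterior centered at the sample mean with n_i - 1 degrees of freedom.
posterior_means = sample_mean + np.sqrt(sample_var / n) * rng.standard_t(df=n - 1, size=(10000, 3))
prob_best = np.bincount(posterior_means.argmax(axis=1), minlength=3) / posterior_means.shape[0]
print("posterior probability each system is best:", prob_best)
```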

Discussion and future directions

Bayesian methods apply to simulation experiments in a variety of ways, including uncertainty analysis, ranking and selection, input distribution modeling, response surface modeling, and experimental design. One main theme is to represent all uncertainty with probability distributions, to update probability using Bayes' rule, and to use the expected value of information as a technique to make sampling decisions (e.g., the opportunity cost and 0–1 loss functions for selection procedures, or the …

Acknowledgement

Portions of this chapter were published by Chick (2004) in the Proceedings of the Winter Simulation Conference.

References (92)

  • J.O. Berger. An overview of robust Bayesian analysis. TEST (1994).
  • J.O. Berger et al. Testing precise hypotheses (with discussion). Statistical Science (1987).
  • J.O. Berger et al. The intrinsic Bayes factor for model selection and prediction. Journal of the American Statistical Association (1996).
  • R. Berk. Limiting behaviour of posterior distributions when the model is incorrect. Annals of Mathematical Statistics (1966).
  • J.M. Bernardo et al. Bayesian Theory (1994).
  • P. Billingsley. Probability and Measure (1986).
  • Branke, J., Chick, S.E., Schmidt, C. (2005). Selecting a selection procedure. Working paper, INSEAD, Technology...
  • C.-H. Chen. A lower bound for the correct subset-selection probability and its application to discrete event simulations. IEEE Transactions on Automatic Control (1996).
  • Chen, C.-H., Yücesan, E., Dai, L., Chen, H. (2006). Efficient computation of optimal budget allocation for discrete...
  • H. Chen et al. Monte Carlo estimation for guaranteed-coverage non-normal tolerance intervals. Journal of Statistical Computation and Simulation (1995).
  • H.-C. Chen et al. An asymptotic allocation for simultaneous simulation experiments.
  • R.C.H. Cheng. Regression metamodelling in simulation using Bayesian methods.
  • R.C.H. Cheng et al. Prior and candidate models in the Bayesian analysis of finite mixtures.
  • R.C.H. Cheng et al. Sensitivity of computer simulation experiments to errors in input data. Journal of Statistical Computation and Simulation (1997).
  • S.E. Chick. Bayesian analysis for simulation input and output.
  • S.E. Chick. Input distribution selection for simulation experiments: Accounting for input uncertainty. Operations Research (2001).
  • S.E. Chick. Bayesian methods for discrete event simulation.
  • S.E. Chick et al. New two-stage and sequential procedures for selecting the best simulated system. Operations Research (2001).
  • S.E. Chick et al. New procedures for identifying the best simulated system using common random numbers. Management Science (2001).
  • S.E. Chick et al. Corrigendum: New selection procedures. Operations Research (2002).
  • S.E. Chick et al. New characterizations of the no-aging property and the 1-isotropic models. Journal of Applied Probability (1998).
  • S.E. Chick et al. Selection procedures with frequentist expected opportunity cost bounds. Operations Research (2005).
  • S.E. Chick et al. Inferring infection transmission parameters that influence water treatment decisions. Management Science (2003).
  • P. Craig et al. Bayesian forecasting for complex systems using computer simulators. Journal of the American Statistical Association (2001).
  • N.A. Cressie. Statistics for Spatial Data (1993).
  • B. de Finetti. Theory of Probability, vol. 2 (1990).
  • M.H. de Groot. Optimal Statistical Decisions (1970).
  • J. Dmochowski. Intrinsic priors via Kullback–Leibler geometry.
  • D. Draper. Assessment and propagation of model uncertainty (with discussion). Journal of the Royal Statistical Society B (1995).
  • E.J. Dudewicz et al. Allocation of observations in ranking and selection with unequal variances. Sankhyā B (1975).
  • A.W.F. Edwards. Likelihood (1984).
  • M. Evans et al. Methods for approximating integrals in statistics with special emphasis on Bayesian integration problems. Statistical Science (1995).
  • M.C. Fu et al. Optimal computing budget allocation under correlated sampling.
  • E.I. George et al. Stochastic search variable selection.
  • W.R. Gilks et al. Adaptive rejection Metropolis sampling. Applied Statistics (1995).