Mixed aleatory-epistemic uncertainty quantification with stochastic expansions and optimization-based interval estimation

https://doi.org/10.1016/j.ress.2010.11.010

Abstract

Uncertainty quantification (UQ) is the process of determining the effect of input uncertainties on response metrics of interest. These input uncertainties may be characterized as either aleatory uncertainties, which are irreducible variabilities inherent in nature, or epistemic uncertainties, which are reducible uncertainties resulting from a lack of knowledge. When aleatory and epistemic uncertainties are mixed, it is desirable to segregate the two sources so that their contributions to the total uncertainty can be separately identified. Current production analyses for mixed UQ employ nested sampling, where each sample taken from the epistemic distributions at the outer loop results in an inner-loop sampling over the aleatory probability distributions. This paper demonstrates new algorithmic capabilities for mixed UQ in which the analysis procedures are more closely tailored to the requirements of aleatory and epistemic propagation. Through the combination of stochastic expansions for computing statistics and interval optimization for computing bounds, interval-valued probability, second-order probability, and Dempster–Shafer evidence theory approaches to mixed UQ are shown to be more accurate and efficient than previously achievable.

Introduction

Uncertainty quantification (UQ) is the process of determining the effect of input uncertainties on response metrics of interest. These input uncertainties may be characterized as either aleatory uncertainties, which are irreducible variabilities inherent in nature, or epistemic uncertainties, which are reducible uncertainties resulting from a lack of knowledge. When sufficient data are available for characterizing aleatory uncertainties, probabilistic methods are commonly used for computing response distribution statistics based on input probability distribution specifications. Conversely, for epistemic uncertainties, data are generally too sparse to support objective probabilistic input descriptions, leading either to subjective probabilistic descriptions (e.g., assumed priors in Bayesian analysis) or to nonprobabilistic methods based on interval specifications.

One technique for the analysis of aleatory uncertainties using probabilistic methods is the polynomial chaos expansion (PCE) approach to UQ. For smooth functions (i.e., analytic, infinitely differentiable) in L2 (i.e., possessing finite variance), exponential convergence rates can be obtained under order refinement for integrated statistical quantities of interest such as mean, variance, and probability. In this work, generalized polynomial chaos using the Wiener–Askey scheme [1] provides a foundation in which Hermite, Legendre, Laguerre, Jacobi, and generalized Laguerre orthogonal polynomials are used for modeling the effect of continuous uncertain variables described by normal, uniform, exponential, beta, and gamma probability distributions, respectively. These polynomial selections are optimal for these distribution types since each polynomial family is orthogonal with respect to an inner product weighting function that corresponds to the probability density function of its associated distribution. Orthogonal polynomials can be computed for any positive weight function, so these five classical families may be augmented with numerically generated polynomials for other probability distributions (e.g., for lognormal, extreme value, and histogram distributions). When independent standard random variables are used (or computed through transformation), the variable expansions are uncoupled, allowing the polynomial orthogonality properties to be applied on a per-dimension basis. This allows one to mix and match the polynomial basis used for each variable without interference with the spectral projection scheme for the response.
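To make the projection mechanics concrete, the following minimal sketch builds a one-dimensional Hermite expansion for a standard normal input. It is an illustration under stated assumptions (numpy's probabilists' Hermite basis, a hypothetical test function f(x) = e^x), not the implementation used in this work.

```python
# Minimal 1-D sketch: non-intrusive PCE by spectral projection for x ~ N(0,1),
# using probabilists' Hermite polynomials He_k (orthogonal w.r.t. exp(-x^2/2)).
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt, pi

def pce_coefficients(f, order, n_quad):
    """c_k = <f, He_k> / <He_k, He_k>, where <He_k, He_k> = k! under N(0,1)."""
    x, w = hermegauss(n_quad)          # Gauss nodes/weights for weight exp(-x^2/2)
    w = w / sqrt(2.0 * pi)             # renormalize so the weights sum to 1
    fx = f(x)
    return np.array([np.dot(w, fx * hermeval(x, [0.0] * k + [1.0])) / factorial(k)
                     for k in range(order + 1)])

# Hypothetical test function f(x) = exp(x): exact coefficients are e^{1/2} / k!.
c = pce_coefficients(np.exp, order=8, n_quad=20)
print(c[0], np.exp(0.5))               # expansion mean vs exact mean
```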

In non-intrusive PCE, simulations are used as black boxes and the calculation of chaos expansion coefficients for response metrics of interest is based on a set of simulation response evaluations. To calculate these response PCE coefficients, two primary classes of approaches have been proposed: spectral projection and linear regression. The spectral projection approach projects the response against each basis function using inner products and employs the polynomial orthogonality properties to extract each coefficient. Each inner product involves a multidimensional integral over the support range of the weighting function, which can be evaluated numerically using sampling, tensor-product quadrature, Smolyak sparse grid [2], or cubature [3] approaches. The linear regression approach uses a single linear least squares solution to solve for the set of PCE coefficients which best match a set of response values obtained from either a design of computer experiments (“point collocation” [4]) or from the subset of tensor Gauss points with highest product weight (“probabilistic collocation” [5]).
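For comparison, here is a sketch of the regression ("point collocation") alternative under the same assumptions: the coefficients are recovered from a single least-squares solve against an oversampled set of random design points.

```python
# Minimal sketch: "point collocation" regression for the same Hermite PCE.
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(0)
order = 8
x = rng.standard_normal(2 * (order + 1))       # oversample: ~2 points per unknown
A = np.column_stack([hermeval(x, [0.0] * k + [1.0]) for k in range(order + 1)])
b = np.exp(x)                                  # black-box response at design points
c_reg, *_ = np.linalg.lstsq(A, b, rcond=None)  # single linear least-squares solve
print(c_reg[:3])                               # compare against e^{1/2} / k!
```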

Stochastic collocation (SC) [6] is a second stochastic expansion approach that is closely related to PCE. As with PCE, exponential convergence rates can be obtained under order refinement for integrated statistical quantities of interest, provided that the response functions are smooth with finite variance. The primary distinction is that, whereas PCE estimates coefficients for known orthogonal polynomial basis functions, SC forms Lagrange interpolation functions for known coefficients. Interpolation is performed on structured grids such as tensor-product or sparse grids. In a tensor-product multidimensional Lagrange interpolant, the ith interpolation polynomial is 1 at collocation point i and 0 at all other collocation points, so the expansion coefficients are simply the response values at the collocation points. Sparse interpolants are weighted sums of these tensor interpolants; however, they are only interpolatory for sparse grids based on fully nested rules and will exhibit some interpolation error at the collocation points for sparse grids based on non-nested rules. A key to maximizing performance with SC is performing collocation using the Gauss points and weights from the same optimal orthogonal polynomials used in PCE. For standard Gauss integration rules (not nested variants such as Gauss–Patterson or Genz–Keister) within tensor-product quadrature, tensor PCE expansions and tensor SC interpolants are equivalent in that identical polynomial approximations are generated [7]. Moreover, this equivalence extends to sparse grids based on standard Gauss rules, provided that a sparse PCE is formed from a weighted sum of tensor expansions [8].
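A corresponding one-dimensional SC sketch, again illustrative rather than this paper's implementation: the interpolant is built on Gauss points from the matching Hermite rule, and its "coefficients" are just the response values at those points.

```python
# Minimal 1-D stochastic collocation sketch on Gauss-Hermite collocation points.
import numpy as np
from numpy.polynomial.hermite_e import hermegauss
from scipy.interpolate import lagrange

x, _ = hermegauss(7)                # collocation points from the optimal Gauss rule
interp = lagrange(x, np.exp(x))     # Lagrange interpolant; coefficients = f(x_i)
print(np.max(np.abs(interp(x) - np.exp(x))))   # ~0: exact at the collocation points
print(interp(0.3), np.exp(0.3))                # accurate between points for smooth f
```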

Once PCE or SC representations have been obtained for a response metric of interest, analytic expressions can be derived for the moments of the expansion (from integration over the aleatory/probabilistic random variables) as well as for various sensitivity measures. Local sensitivities (i.e., derivatives) and global sensitivities [9] (i.e., ANOVA, variance-based decomposition) of the response metrics may be computed with respect to the expansion variables, and local sensitivities of probabilistic response moments may be computed with respect to other nonprobabilistic variables [10] (i.e., design or epistemic uncertain variables). This latter capability allows for efficient design under uncertainty and mixed aleatory-epistemic UQ formulations involving moment control or bounding. This paper presents two approaches for calculation of sensitivities of moments with respect to nonprobabilistic dimensions (design or epistemic), one involving response function expansions over both probabilistic and nonprobabilistic variables and one involving response derivative expansions over only the probabilistic variables.
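The closed-form moments mentioned above follow directly from orthogonality: for a Hermite expansion in a standard normal variable, the mean is c_0 and the variance is the sum over k ≥ 1 of c_k^2 k!. A self-contained sketch using the same hypothetical f(x) = e^x, whose exact variance is e^2 − e:

```python
# Minimal sketch: analytic moments from PCE coefficients via orthogonality.
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from math import factorial, sqrt, pi

x, w = hermegauss(20)
w = w / sqrt(2.0 * pi)
c = [np.dot(w, np.exp(x) * hermeval(x, [0.0] * k + [1.0])) / factorial(k)
     for k in range(9)]                        # coefficients as in the sketch above

mean = c[0]                                    # E[f] = c_0
var = sum(c[k] ** 2 * factorial(k) for k in range(1, len(c)))  # sum c_k^2 <He_k^2>
print(mean, var, np.exp(0.5), np.exp(2) - np.exp(1))   # vs exact mean and variance
```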

A common approach to quantifying the effects of mixed aleatory and epistemic uncertainties is to separate the aleatory and epistemic variables and perform nested iteration. This separation allows the use of strong probabilistic inferences where possible, while employing alternative inferences only where necessary. Traditionally, this has involved a nested sampling approach, in which each sample drawn from the epistemic variables on the outer loop results in a sampling over the aleatory variables on the inner loop. In this fashion, we generate families or ensembles of response distributions, where each distribution represents the uncertainty generated by sampling over the aleatory variables. Plotting an entire ensemble of cumulative distribution functions (CDFs) in a “horsetail” plot allows one to visualize the upper and lower bounds on the family of distributions (see Fig. 1). However, nested iteration can be computationally expensive when it is implemented using two random sampling loops. Consequently, when employing simulation-based models, the nested sampling must often be under-resolved, particularly at the epistemic outer loop, resulting in an under-prediction of credible output ranges. Thus, the central goal in this work is to preserve the advantages of uncertainty separation (visualization, interpretation, and tailoring of inferences), but address issues with accuracy and efficiency within the nested iteration by closely tailoring the algorithmic approaches to the propagation needs at each level.
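The nested pattern is simple to state in code. The sketch below uses a hypothetical stand-in model f(a, e) with aleatory a ~ N(0,1) and epistemic e in the interval [0.5, 1.5]; the model, interval, and sample counts are illustrative choices, not taken from the paper.

```python
# Minimal sketch of traditional nested (double-loop) sampling for mixed UQ.
import numpy as np

rng = np.random.default_rng(1)
f = lambda a, e: e * a ** 2 + a                 # hypothetical stand-in model

cdfs = []
for e in rng.uniform(0.5, 1.5, size=20):        # outer loop: epistemic samples
    a = rng.standard_normal(1000)               # inner loop: aleatory samples
    cdfs.append(np.sort(f(a, e)))               # one empirical CDF per outer sample

# Pointwise extremes over the ensemble approximate the "horsetail" bounds.
ensemble = np.array(cdfs)
lower, upper = ensemble.min(axis=0), ensemble.max(axis=0)
print(lower[990], upper[990])                   # spread near the 99th percentile
```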

We propose a new approach for performing mixed UQ in which the inner-loop CDFs are calculated using a stochastic expansion method (using either aleatory expansions formed for each instance of the epistemic variables or combined expansions over both variable sets), and the outer-loop bounds are computed with optimization-based interval estimation (using either local gradient-based or global nongradient-based optimizers). The advantages of this approach can be significant. First, the stochastic expansion methods can be much more efficient than sampling for calculating moments or CDF values (exponential convergence rates rather than the slow algebraic rates of random sampling, e.g., O(N^{-1/2}) error for Monte Carlo). Second, analytic statistics and their derivatives can be computed in closed form from the expansions. This enables efficient optimization approaches, including gradient-based local methods such as sequential quadratic programming and nongradient-based global methods such as efficient global optimization, to compute response intervals through direct minimization and maximization of the response statistics over the range of the epistemic inputs. These optimization methods are more directed and will generally be more accurate and efficient than using random sampling to estimate the interval bounds.
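A minimal end-to-end sketch of this pattern for the same hypothetical model: the inner-loop statistic (here the mean over the aleatory variable) comes from a Gauss quadrature in the spirit of the stochastic expansions, and a bounded local optimizer replaces outer-loop sampling. NPSOL and EGO are the optimizers used in this work; scipy's bounded scalar minimizer is merely a stand-in.

```python
# Minimal sketch: optimization-based interval estimation on the epistemic outer loop.
import numpy as np
from numpy.polynomial.hermite_e import hermegauss
from scipy.optimize import minimize_scalar

xa, wa = hermegauss(10)
wa = wa / np.sqrt(2.0 * np.pi)                  # quadrature for E[.] over a ~ N(0,1)

def mean_response(e):                           # inner loop: statistic at fixed e
    return np.dot(wa, e * xa ** 2 + xa)         # E[e*a^2 + a] = e (exact here)

lo = minimize_scalar(mean_response, bounds=(0.5, 1.5), method="bounded")
hi = minimize_scalar(lambda e: -mean_response(e), bounds=(0.5, 1.5), method="bounded")
print(lo.fun, -hi.fun)                          # interval on the mean: ~[0.5, 1.5]
```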

Section 2 describes the approaches to mixed UQ, Section 3 describes the outer loop of local and global optimization-based interval estimation, Section 4 describes the inner loop of collocation-based stochastic expansion methods, Section 5 presents computational experiments using these components within mixed UQ studies, and Section 6 provides concluding remarks.

Overall, we will explore eight algorithm combinations (PCE or SC, aleatory or combined stochastic expansions, local or global optimization) and deploy them within interval-valued probability, second-order probability, and Dempster–Shafer approaches to mixed UQ for several algebraic test problems. These combinations are summarized in Table 1, including the test problems from Section 5 that are used to explore the performance of the different approaches. Since it was impractical to explore every combination for every problem, selected results are presented that are representative of the trends of interest. For interval-valued probability (IVP), we examined global and local optimization methods (EGO and NPSOL, respectively) to calculate the outer-loop bounding intervals, combined with polynomial chaos expansions (PCE) and stochastic collocation (SC) using Smolyak sparse grids (SSG) to calculate inner-loop statistics from either aleatory or combined expansions, for a total of eight combinations. A ninth combination was the traditional approach of LHS sampling at both levels [11], [12]. Since the Dempster–Shafer theory of evidence (DSTE) approach is simply IVP applied to multiple cell combinations, it employs the same algorithmic combinations; moreover, the set of test problems can be reduced, focusing on performance relative to IVP. For second-order probability (SOP), no optimization is involved: we calculate distributions on distributions using stochastic expansion methods at both levels. The outer-loop expansions are based on PCE or SC with tensor-product quadrature (TPQ), since they involve only a few epistemic variables, and the inner-loop expansions employ the same combinations as IVP and DSTE (PCE or SC with SSG and either aleatory or combined expansions). Including the nested LHS sampling reference approach, nine total combinations are again employed for SOP.


Approaches to mixed UQ

Epistemic uncertainty is sometimes referred to as state-of-knowledge uncertainty, subjective uncertainty, or reducible uncertainty, meaning that the uncertainty can be reduced through increased understanding (research) or through increased and more relevant data [13]. There are a variety of approaches to propagating epistemic uncertainty, many of which differ significantly from traditional probabilistic propagation techniques. Interval methods are one approach, and there are many others, including the Dempster–Shafer theory of evidence and second-order probability approaches considered in this work.

The outer loop: optimization-based interval estimation

This section presents a general formulation for determining interval bounds on the output measures of interest in the case of mixed epistemic-aleatory uncertainties. Given the capability to compute analytic statistics of the response along with sensitivities of these statistics with respect to epistemic parameters, we pursue optimization-based interval estimation approaches for epistemic and mixed aleatory-epistemic uncertainty quantification.
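In outline, with s(e) denoting an inner-loop statistic (e.g., a moment or CDF value) of the response computed at fixed epistemic parameters e, the outer loop solves a pair of bound-constrained problems (the notation below is ours, introduced for illustration):

```latex
% Outer-loop interval estimation over the epistemic parameter box [e_L, e_U]:
\underline{s} = \min_{\mathbf{e}_L \le \mathbf{e} \le \mathbf{e}_U} s(\mathbf{e}),
\qquad
\overline{s} = \max_{\mathbf{e}_L \le \mathbf{e} \le \mathbf{e}_U} s(\mathbf{e})
```

The pair of solutions yields the response interval [s_lower, s_upper] on the statistic; when derivatives of s with respect to e are available from the expansion, they drive the gradient-based variant.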

Where applicable, we will employ derivatives of the response statistics with respect to the epistemic parameters within the gradient-based interval estimation approaches.

The inner loop: stochastic expansion methods

For the inner loop in our nested analysis procedure, we employ stochastic expansion methods, in particular polynomial chaos expansions (PCE) and stochastic collocation (SC). We first describe the polynomial basis functions used in these methods, followed by methods for forming the expansions and capabilities for computing statistics from them.
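Once formed, an inner-loop expansion serves as a cheap surrogate for statistics that lack closed forms, such as CDF values. A minimal sketch, reusing the one-dimensional SC interpolant idea with the hypothetical f(x) = e^x from the earlier sketches:

```python
# Minimal sketch: CDF estimation by densely sampling the cheap SC surrogate.
import numpy as np
from numpy.polynomial.hermite_e import hermegauss
from scipy.interpolate import lagrange

x, _ = hermegauss(7)
interp = lagrange(x, np.exp(x))                 # surrogate for the expensive model
a = np.random.default_rng(2).standard_normal(200_000)
z = 2.0
print(np.mean(interp(a) <= z))                  # surrogate estimate of P[f <= z]
print(np.mean(np.exp(a) <= z))                  # reference: ~Phi(ln 2) = 0.756
```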

Analytic benchmarks and results

Capabilities for uncertainty analysis based on stochastic expansions and optimization-based interval estimation have been implemented in DAKOTA [57], an open-source software framework for design and performance analysis of computational models on high performance computers. This section examines computational performance of these algorithmic approaches for several algebraic benchmark test problems. These results build upon IVP and DSTE interval estimation results presented in [58], [33], [23].

Conclusions

The goal of this activity has been to develop interval-valued probability (IVP), second-order probability (SOP), and Dempster–Shafer theory of evidence (DSTE) approaches for mixed aleatory-epistemic uncertainty quantification that can be more accurate (via precise bounds from optimizers) and more efficient (via exponential convergence rates from stochastic expansion methods) than existing approaches based on nested sampling. Computational experiments have demonstrated that the coupling of local and global optimization-based interval estimation with stochastic expansion methods can deliver these accuracy and efficiency gains relative to nested sampling.

Acknowledgments

The authors thank Joe Castro, Chuck Hembree, and Biliana Paskaleva for guiding the deployment of these capabilities to real-world applications, John Burkardt of Virginia Tech for development of sparse grid software used for numerical integrations, and Gianluca Iaccarino of Stanford University for his helpful comments on this work.

References (62)

  • A. Stroud. Approximate calculation of multiple integrals (1971).
  • Walters RW. Towards stochastic fluid mechanics via polynomial chaos. In: Proceedings of the 41st AIAA aerospace...
  • Tatang M. Direct incorporation of uncertainty in chemical and environmental engineering systems. PhD thesis, MIT;...
  • F. Nobile et al. An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data. SIAM Journal on Numerical Analysis (2008).
  • P.G. Constantine et al. Spectral methods for parameterized matrix equations. SIAM Journal on Matrix Analysis and Applications (2010).
  • Constantine PG, Eldred MS, Phipps ET. Sparse polynomial chaos expansions. Int J Uncertainty Quant, in...
  • Tang G, Iaccarino G, Eldred MS. Global sensitivity analysis for stochastic collocation expansion. In: Proceedings of...
  • Eldred MS, Webster CG, Constantine P. Design under uncertainty employing stochastic expansion methods. In: Proceedings...
  • M.D. McKay et al. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics (1979).
  • Helton JC. Conceptual and computational basis for the quantification of margins and uncertainty. Technical Report...
  • D.R. Karanki et al. Uncertainty analysis based on probability bounds (p-box) approach in probabilistic safety assessment. Risk Analysis (2009).
  • Aughenbaugh JM, Paredis CJJ. Probability bounds analysis as a general approach to sensitivity analysis in decision...
  • National Research Council of the National Academies. Evaluation of quantification of margins and uncertainties...
  • Swiler LP, Paez TL, Mayes RL. Epistemic uncertainty quantification tutorial. In: Proceedings of the IMAC XXVII...
  • G. Shafer. A mathematical theory of evidence (1976).
  • Tang G, Swiler LP, Eldred MS. Using stochastic expansion methods in evidence theory for mixed aleatory-epistemic...
  • Eldred MS. Design under uncertainty employing stochastic expansion methods. International Journal for Uncertainty...
  • Swiler LP, Paez TL, Mayes RL, Eldred MS. Epistemic uncertainty in the calculation of margins. In: Proceedings of the...
  • Eldred MS, Dunlavy DM. Formulations for surrogate-based optimization with data fit, multifidelity, and reduced-order...
  • Gill PE, Murray W, Saunders MA, Wright MH. User's guide for NPSOL 5.0: a Fortran package for nonlinear programming....
  • J. Nocedal et al. Numerical optimization (1999).
1. Sandia National Laboratories is a multi-program laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Company, for the U.S. Department of Energy's National Nuclear Security Administration under Contract DE-AC04-94AL85000.
