Adaptive sequential strategy for risk estimation of engineering systems using Gaussian process regression active learning
Introduction
Quantitative evaluation of the risk associated with an engineering system is an important part of that system's risk assessment. Risk estimation helps engineers understand the magnitude of the risk and make sound decisions when separating acceptable risk from unacceptable risk. Acceptable risk refers to the level of risk that the final user can tolerate given constraints such as extra cost.
Since uncertainty is part of engineering design, the risk of failure cannot be completely eliminated. The major sources of uncertainty in engineering design are noise, error, and bias in the sample data, and error in the model or in the approximation techniques used to solve it. Because uncertainty is ubiquitous, estimating the safety of a system under abnormal operating conditions or in a failure environment is part of realistic modeling of that system (Oberkampf et al., 2002). In practice, such realistic modeling requires complex and time-consuming mathematical models. Therefore, efficient risk estimation models are needed to estimate the probability of failure of a system. The wide application of such methodologies in a variety of asset-intensive industries, such as manufacturing, oil and gas, utilities, chemicals, and life sciences, motivates the authors to propose their risk estimation model.
The direct Monte Carlo method (Rubinstein, 2008) is the most robust risk estimation model, since it does not depend on the dimension or complexity of the model. However, it is computationally expensive for systems with a low probability of failure. At the expense of robustness, its efficiency can be increased using variance reduction techniques (Au, 2016). To this end, many advanced Monte Carlo methods have been proposed, such as Subset Simulation (Papaioannou et al., 2015), Directional Simulation [28], [21], Spherical Subset Simulation (Katafygiotis and Cheung, 2007), the Line Sampling method (de Angelis et al., 2015), and Asymptotic Sampling (Bucher, 2009). Recently, an alternative approach based on the Gaussian process model [16], [31] has been proposed. The present study takes this latter approach. However, the method proposed in this paper differs significantly from other Gaussian process-based models in the literature. For example, Echard et al. used the First Order Reliability Method (FORM) and a variance reduction technique known as importance sampling to find the most probable failure point. They then used the Gaussian process to predict the outcome of the expensive-to-evaluate system. The performance function in FORM is approximated by a first-order Taylor expansion, which can be a source of error for nonlinear systems. Picheny et al., on the other hand, used a Gaussian process regression method known as the universal kriging model. It is known that the underlying variogram for such a method cannot be calculated for irregularly gridded data, even with a known drift function (Wackernagel, 2013, p. 305). The method proposed in this paper uses an active learning strategy coupled with a covariance-based Gaussian process model to find the failure region, and it is capable of handling nonlinear performance functions in a multidimensional feature space.
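To make the cost of direct Monte Carlo for rare failures concrete, the following minimal sketch estimates a failure probability $P(g(X) \le 0)$ by crude sampling and reports the coefficient of variation of the estimator, $\sqrt{(1-p)/(Np)}$. The limit state `g`, its threshold of 3, and the helper names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def crude_monte_carlo(g, sampler, n, rng):
    """Estimate P(g(X) <= 0) and the estimator's coefficient of variation."""
    x = sampler(rng, n)
    p_hat = float(np.mean(g(x) <= 0.0))
    # Coefficient of variation of the binomial estimator;
    # it is undefined (reported as inf) when no failure was sampled.
    cov = float(np.sqrt((1.0 - p_hat) / (n * p_hat))) if p_hat > 0 else float("inf")
    return p_hat, cov

def required_samples(p, target_cov):
    """Sample size needed to reach a target coefficient of variation."""
    return int(np.ceil((1.0 - p) / (p * target_cov**2)))

# Hypothetical limit state: failure when a standard normal variable exceeds 3.
g = lambda x: 3.0 - x
rng = np.random.default_rng(0)
p_hat, cov = crude_monte_carlo(g, lambda r, n: r.standard_normal(n), 10**6, rng)
n_needed = required_samples(1e-3, 0.05)
```

Under these assumptions the required sample size grows roughly like $1/(p\,\delta^2)$ for a target coefficient of variation $\delta$, which is why each expensive model evaluation matters when $p$ is small.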
Recently, several studies have also coupled Gaussian process models with sampling-based methods. For instance, Huang, Allen, Notz, and Miller combined a Gaussian process regression based surrogate model with multiple-fidelity data to increase the efficiency of optimization (Huang et al., 2006). To estimate small failure probabilities, one commonly used approach is to couple the Polynomial Response Surface Method (PRSM) with FORM [18], [19], or a Bayesian framework with FORM [17], [8], [2]. Such methodologies provide biased estimates of the probability of failure, since they rely on the FORM estimate of the most probable failure point. Another commonly used alternative to FORM for estimating the probability of failure of complex systems is the Kriging meta-modeling technique (Drignei, 2017). For example, using a learning function based on the probability that the metamodel classification satisfies a constraint, Echard, Gayton, and Lemaire proposed a reliability method combining Gaussian process regression and the Monte Carlo method (Echard et al., 2011). Bect et al. proposed a Bayesian decision theory framework to derive an optimal sequential strategy for estimating the probability of failure (Bect et al., 2012), and Dubourg, Sudret, and Deheeger proposed coupling importance sampling with a Gaussian process regression based surrogate model to approximate a quasi-optimal importance sampling density [14], [13]. Coupling Monte Carlo simulation with active learning algorithms such as neural networks (Sener and Savarese, 2017) and support vector machines (Tong and Koller, 2001) can also be used to estimate the probability of failure of engineering systems efficiently and accurately. However, neural network active learning suffers from a lack of interpretability, and support vector machine active learning tends to yield better results for binary classification problems than for regression problems.
Moreover, although these methods have been shown to give good estimates of the expected value of the hypothesis, none of them directly provides the variance of the prediction. Interpretability and the ability to estimate the variance of the model are two key factors in selecting a proper model for risk estimation. Gaussian process regression is a highly interpretable machine learning algorithm that provides both the expected value and the variance of the model. These observations motivate the authors to propose Gaussian process regression active learning for risk estimation of engineering systems.
Background information
In measure theory, the sample space $\Omega$ is a finite or infinite set of all possible outcomes of an experiment, and any subset of the sample space is an event. A collection $\mathcal{F}$ of subsets of $\Omega$ is a $\sigma$-algebra on $\Omega$ if it is closed under complementation and closed under taking countable unions. The pair $(\Omega, \mathcal{F})$ is called a measurable space.
A measure on $(\Omega, \mathcal{F})$ is a map $\mu: \mathcal{F} \to [0, \infty]$ such that $\mu$ is countably additive over every sequence of pairwise disjoint events. If $\mu(\Omega) = 1$, then $\mu$ is called a probability measure, and the triple
Gaussian process regression
Let $\{Z(x) : x \in D\}$ denote a random field with $D$ in a d-dimensional metric space, and let $Z(x)$ be a random variable for each $x \in D$. Moreover, let $\mathbf{Z} = (Z(x_1), \ldots, Z(x_n))^T$ be a random vector containing the observed responses at the neighboring sample points. The predicted value of the random variable $Z(x_0)$ at the unobserved point $x_0$ is of interest. This value should be estimated under the model assumption that the random variable is second-order intrinsically stationary with known covariance
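As an illustration of prediction at unobserved points with a known covariance, the sketch below computes the posterior mean and variance of a zero-mean Gaussian process. The squared-exponential covariance, the zero-mean assumption, the jitter term, and all function names are assumptions made for this example; the excerpt does not state which covariance model the paper uses.

```python
import numpy as np

def sq_exp_cov(a, b, variance=1.0, length=1.0):
    """Squared-exponential covariance between point sets a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length**2)

def gp_predict(x_train, y_train, x_new, cov=sq_exp_cov, jitter=1e-10):
    """Posterior mean and variance of a zero-mean GP at the points x_new."""
    # Covariance among observations; jitter keeps the factorization stable.
    K = cov(x_train, x_train) + jitter * np.eye(len(x_train))
    k_star = cov(x_train, x_new)                      # shape (n, m)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = k_star.T @ alpha
    v = np.linalg.solve(L, k_star)
    var = np.diag(cov(x_new, x_new)) - (v**2).sum(axis=0)
    return mean, np.maximum(var, 0.0)

# Three observations of a hypothetical response, predicted at four points.
x_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.sin(x_train[:, 0])
x_new = np.array([[0.0], [1.0], [2.0], [10.0]])
mean, var = gp_predict(x_train, y_train, x_new)
```

At the observed points the predictor reproduces the data with near-zero variance, while far from the data it reverts to the prior mean and variance. This explicit variance is the property that an active learning strategy can exploit.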
Example 1: Simple one-dimensional explanatory example
The objective is to find an algorithm to numerically estimate the probability , where and when is a standard normal random variable.
Monte Carlo method estimates the probability with a sample size equal to 6. Thus, using Eq. (5), there is 3.5% error in the estimator .
The proposed methodology separates all the failed and unfailed sample points in just eight observations. The graphical representation of this limit state function is shown
Conclusion
This paper presents an adaptive sequential strategy that combines the Monte Carlo method with Gaussian process regression to build an active learning algorithm for risk estimation of engineering systems. First, Gaussian process regression is presented. Then, an active learning strategy is proposed. This sequential strategy consists of the selection of initial training points, the selection of new training points, a probability estimator function, and a stopping criterion.
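The four components listed above can be sketched as a single loop. The paper's own learning function and stopping rule are not given in this excerpt, so the sketch substitutes the U learning function and the U >= 2 stopping rule of AK-MCS (Echard et al., 2011) as stand-ins. The quantile-based initial design, the squared-exponential covariance, the limit state `g`, and all names are likewise assumptions for illustration.

```python
import numpy as np

def sq_exp_cov(a, b, length=1.0):
    """Squared-exponential covariance between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_predict(x_tr, y_tr, x_new, jitter=1e-8):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = sq_exp_cov(x_tr, x_tr) + jitter * np.eye(len(x_tr))
    k_star = sq_exp_cov(x_tr, x_new)
    mean = k_star.T @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def active_learning_pf(g, pool, n_init=4, max_evals=30, u_stop=2.0):
    """Estimate P(g(X) <= 0) over a Monte Carlo pool with few calls to g."""
    # 1. Initial training points: evenly spaced quantiles of the pool (assumed).
    order = np.argsort(pool)
    picks = np.linspace(0, len(pool) - 1, n_init).astype(int)
    evaluated = list(order[picks])
    y = [g(pool[i]) for i in evaluated]
    while True:
        mean, sd = gp_predict(pool[evaluated], np.array(y), pool)
        # 2. Learning function: U = |mean| / sd, small where the sign is unsure.
        u = np.abs(mean) / sd
        u[evaluated] = np.inf          # never re-select an evaluated point
        best = int(np.argmin(u))
        # 4. Stopping criterion: all signs confident, or budget exhausted.
        if u[best] >= u_stop or len(evaluated) >= max_evals:
            break
        evaluated.append(best)
        y.append(g(pool[best]))
    # 3. Probability estimator: fraction of the pool predicted to fail.
    return float(np.mean(mean <= 0.0)), len(evaluated)

# Hypothetical limit state: failure when a standard normal variable exceeds 2.
g = lambda x: 2.0 - x
pool = np.random.default_rng(1).standard_normal(2000)
p_hat, n_evals = active_learning_pf(g, pool)
p_ref = float(np.mean(g(pool) <= 0.0))   # brute force on the same pool
```

Comparing `p_hat` with `p_ref` shows how closely the surrogate reproduces the brute-force classification of the pool while calling `g` at most `max_evals` times instead of once per sample.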
Applying the
References (40)
On MCMC algorithm for subset simulation. Probab. Eng. Mech. (2016)
A unified approach for the design of steel structures under low and/or high cycle fatigue. J. Constr. Steel Res. (1995)
Structural reliability analysis using a standard deterministic finite element code. Struct. Saf. (1997)
Asymptotic sampling for high-dimensional reliability analysis. Probab. Eng. Mech. (2009)
A Bayesian Monte Carlo-based algorithm for the estimation of small failure probabilities of systems affected by uncertainties. Reliab. Eng. Syst. Saf. (2016)
Advanced line sampling for efficient robust reliability analysis. Struct. Saf. (2015)
An estimation algorithm for fast kriging surrogates of computer models with unstructured multiple outputs. Comput. Methods in Appl. Mech. Eng. (2017)
Meta-model-based importance sampling for reliability sensitivity analysis. Struct. Saf. (2014)
Metamodel-based importance sampling for structural reliability analysis. Probab. Eng. Mech. (2013)
AK-MCS: An active learning reliability method combining Kriging and Monte Carlo Simulation. Struct. Saf. (2011)