Combining screening and metamodel-based methods: An efficient sequential approach for the sensitivity analysis of model outputs

doi:10.1016/j.ress.2014.08.009

Reliability Engineering & System Safety

Volume 134, February 2015, Pages 334-344

Highlights

• Quasi-OTEE and Kriging-based SA are reviewed and combined into a sequential SA.
• The sequential SA produces results similar to those of the analytical calculations and the variance-based SA.
• The sequential SA cuts the computational cost by a factor of more than 50 relative to the variance-based SA.
• An efficient SA for high-dimensional and computationally expensive models.

Abstract

Sensitivity analysis (SA) identifies the most influential parameters of a given model. Applying SA is usually critical for reducing the complexity of the subsequent model calibration and use. Unfortunately, it is rarely applied, especially when the model is a computationally expensive black-box computer program. A possible solution is to apply SA to a metamodel (i.e., an approximation of the computationally expensive model) instead. Among other options, the use of Gaussian process metamodels (also known as Kriging metamodels) has recently been proposed for the SA of computationally expensive traffic simulation models. However, the main limitation of this approach is its dependence on the model dimensionality. When the model is high-dimensional, estimating the Kriging metamodel may still be problematic due to its high computational cost.

To overcome this problem, the present paper combines the Kriging-based approach with the quasi-optimized trajectory based elementary effects (quasi-OTEE) approach for the SA of high-dimensional models. The quasi-OTEE SA is used first to separate the influential from the non-influential parameters of a high-dimensional model; the Kriging-based SA is then used to calculate the variance-based sensitivity indices and to rank the most influential parameters more accurately. The application of the proposed sequential SA is illustrated with several numerical experiments. Results show that the method can properly identify the most influential parameters and their ranks, while requiring considerably fewer model evaluations than the variance-based SA (e.g., in one of the tests the sequential SA requires over 50 times fewer model evaluations than the variance-based SA).

Introduction

Simulation models are now widely used in various scientific disciplines, e.g., for system design, evaluation and optimization. The reliability of the simulation results depends on the quality of the calibration. Calibration is therefore essential, yet it is usually complicated when the model contains hundreds or even thousands of parameters (i.e., is high-dimensional), and/or when running the model is computationally expensive.

Given the constraints on computation time and other resources, one feasible way to reduce the complexity of calibrating a high-dimensional and computationally expensive model is to calibrate only the most influential input parameters, i.e., the parameters whose variations are expected to have a significant impact on the model output. In this way, the model outputs can be efficiently adjusted towards the correct values by fine-tuning the influential parameters. The proper approach to identify the influential and non-influential parameters is sensitivity analysis (SA).

SA explores the relationship between model outputs and input parameters [1]. A proper SA could provide qualitative and/or quantitative information regarding the effects of different model parameters (and their variations) on the model outputs. Such information can be used to eliminate the least relevant parameters in the subsequent calibration, and help practitioners to better understand both the model and its parameters, especially when the model is high-dimensional or behaves like a “black-box”.

Due to its importance, SA has been extensively developed in the last decades [1]. Some of the widely known SA methods are briefly described below.

The derivative-based method uses the one-at-a-time (OAT) design, i.e., it varies one parameter at a time while keeping all other parameters fixed at a nominal value. The sensitivity measure of the varied parameter is estimated by computing the corresponding partial derivative of the model response. This method requires only a few model evaluations to estimate the derivatives. However, it cannot detect interaction effects among parameters [2], and the derivatives are informative only at the fixed nominal points in the input space [3]. Studies such as [2], [3], [4] and [5] have proposed approaches in which the derivatives are averaged over the multi-dimensional input space to explore the interaction effects. In this way, global sensitivity measures can be obtained.
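The one-at-a-time derivative estimate described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited study: the function name `oat_derivatives`, the forward-difference step rule and the toy model are all assumptions introduced here.

```python
def oat_derivatives(model, nominal, rel_step=1e-6):
    """Forward-difference partial derivatives of `model` at one nominal point.

    Costs k + 1 model evaluations for k parameters.
    """
    base = model(nominal)
    grads = []
    for i, x_i in enumerate(nominal):
        h = rel_step * max(abs(x_i), 1.0)  # step scaled to the parameter size
        shifted = list(nominal)
        shifted[i] = x_i + h               # vary one parameter, keep the rest fixed
        grads.append((model(shifted) - base) / h)
    return grads


def toy_model(x):
    # Hypothetical test model with an interaction between x[1] and x[2].
    return 2.0 * x[0] + x[1] * x[2]


# The derivative with respect to x[2] equals x[1], so it changes with the
# nominal point: about 0 at the first point and about 2 at the second.
print(oat_derivatives(toy_model, [1.0, 0.0, 3.0]))
print(oat_derivatives(toy_model, [1.0, 2.0, 3.0]))
```

The two calls show why a single nominal point can be misleading: at the first point x[2] looks non-influential only because x[1] happens to be zero there.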

The screening method is typically employed to identify the non-influential parameters of a model. Examples of this kind of method can be found in [2], [6], [7], and [8]. The screening method usually requires relatively few model runs, which makes it attractive for complex models. It can also be used to reduce the number of parameters to be considered before applying a more demanding method such as the variance-based method. According to Kucherenko et al. [3], one drawback of the screening method is that it does not provide straightforward information on the total effects (for details see the review in Section 2.1), and it ranks the parameters less accurately than the variance-based method.
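To make the screening idea concrete, the sketch below computes elementary effects on plain random trajectories in the unit hypercube. It is a simplified variant (steps are always taken in the positive direction) and does not implement the optimized or quasi-optimized trajectory selection discussed in this paper. The function name, the default settings and the toy model are assumptions made for illustration.

```python
import random
import statistics


def elementary_effects(model, k, r=10, p=4, seed=0):
    """Return (mu_star, sigma) per parameter from r random trajectories.

    Uses a p-level grid on [0, 1]^k and costs r * (k + 1) model evaluations.
    """
    rng = random.Random(seed)
    delta = p / (2.0 * (p - 1))
    # Start only from grid levels for which x + delta stays inside [0, 1].
    base_levels = [i / (p - 1) for i in range(p // 2)]
    effects = [[] for _ in range(k)]
    for _ in range(r):
        x = [rng.choice(base_levels) for _ in range(k)]
        y = model(x)
        order = list(range(k))
        rng.shuffle(order)          # move each parameter once, in random order
        for i in order:
            x_new = list(x)
            x_new[i] = x[i] + delta
            y_new = model(x_new)
            effects[i].append((y_new - y) / delta)
            x, y = x_new, y_new
    mu_star = [statistics.fmean([abs(e) for e in ee]) for ee in effects]
    sigma = [statistics.stdev(ee) for ee in effects]
    return mu_star, sigma


def toy_model(x):
    # Hypothetical model: x[0] is linear, x[1] and x[2] interact, x[3] is inert.
    return 5.0 * x[0] + x[1] * x[2]


mu_star, sigma = elementary_effects(toy_model, k=4)
for i, (m, s) in enumerate(zip(mu_star, sigma)):
    print(f"x{i}: mu*={m:.3f} sigma={s:.3f}")
```

A large mean absolute effect (mu*) flags an influential parameter, while a non-zero spread (sigma) hints at non-linearity or interactions. Here x[3] never changes the output, so both of its measures are exactly zero.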

The variance-based method decomposes the total variance of the model outputs into the conditional variances associated with the individual parameters, and uses this measure to represent the importance of each parameter (for details see the review in Section 2.2). The development of the variance-based method can be found in [9], [10], [11], [12], [13]. The variance-based method is one of the best methods available today for computing sensitivity indices purely from model evaluations [1]. However, a good estimate of these quantitative sensitivity indices usually requires a large number of model evaluations. Although an improved sampling method was developed in [14] to enhance its efficiency, the high computational demand still makes this method less practical for large computational models [2], [3].
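As an illustration of why the variance-based method is costly, the sketch below estimates first-order and total indices by plain Monte Carlo with the Jansen estimators, which need n * (k + 2) model evaluations for k parameters. It is a sketch under stated assumptions: independent inputs uniform on [0, 1], pseudo-random rather than quasi-random sampling, and a hypothetical linear test model whose exact indices are 0.2, 0.8 and 0.

```python
import random
import statistics


def jansen_indices(model, k, n=20000, seed=0):
    """Monte Carlo estimates of first-order and total sensitivity indices."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    f_a = [model(row) for row in A]
    f_b = [model(row) for row in B]
    variance = statistics.pvariance(f_a + f_b)
    first, total = [], []
    for i in range(k):
        sq_b = 0.0  # sum of (f(B) - f(AB_i))^2, estimates 2 * (V - V_i)
        sq_a = 0.0  # sum of (f(A) - f(AB_i))^2, estimates 2 * V_Ti
        for a_row, b_row, y_a, y_b in zip(A, B, f_a, f_b):
            ab = list(a_row)
            ab[i] = b_row[i]        # row of A with column i taken from B
            y_ab = model(ab)
            sq_b += (y_b - y_ab) ** 2
            sq_a += (y_a - y_ab) ** 2
        first.append(1.0 - sq_b / (2.0 * n * variance))
        total.append(sq_a / (2.0 * n * variance))
    return first, total


def toy_model(x):
    # Hypothetical additive model; x[2] has no effect on the output.
    return x[0] + 2.0 * x[1]


first, total = jansen_indices(toy_model, k=3)
print([round(s, 2) for s in first])
print([round(s, 2) for s in total])
```

With the default settings this small example already uses 100,000 model evaluations, which is affordable for a toy function but not for an expensive simulation.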

The metamodel-based method relies on a metamodel, i.e., an abstraction of the original model. When the original model behaves like a black-box, and/or when it is very costly to run, the metamodel can be used to approximate the original model (more details are given in Section 2.2). Since the metamodel itself is usually computationally cheap, the variance-based sensitivity indices can be efficiently estimated on the metamodel rather than on the original model. Examples of applying metamodels to estimate the total sensitivity indices can be found in [15], [16]. Most of the effort is spent on developing the metamodel (e.g., mapping all possible interactions [17]) and calibrating it. As this effort generally depends on the number of parameters contained in the model [17], the computational cost can still be very large when the original model contains many parameters. In addition, when the model is high-dimensional and the interactions among the parameters are not negligible, it is difficult to obtain an accurate metamodel.

The choice of an SA method depends strongly on the model to be analyzed and on the goal of the analysis [18]. Therefore, there is no universal SA method that fits all possible needs. For complex models, it is important that the SA method be both accurate and efficient. However, none of the above methods seems to fully satisfy this requirement when used alone: the derivative-based and screening-based methods lack accuracy in estimating the total effects or the interaction effects, while the variance-based and metamodel-based methods are usually computationally expensive when used on complex models. To achieve a correct and feasible SA for complex models, specifically high-dimensional and computationally expensive models, in this paper we propose a novel method that combines two recently developed global SA approaches, namely the quasi-optimized trajectory based elementary effects (quasi-OTEE) approach and the Kriging-based approach.

The quasi-OTEE approach belongs to the category of screening methods. It was introduced in [19] and [20] and builds on the elementary effects (EE) method [6], but is much more efficient. The two validation experiments and the case study provided in [19] and [20] demonstrated that, with a small number of model evaluations, this tool can properly identify the non-influential parameters of a computationally expensive model to which other quantitative SA techniques cannot feasibly be applied at the outset. For example, in [20] it was shown that the quasi-OTEE approach yielded results similar to those obtained with the OTEE method in [7], but required only a small fraction of its computation time.

The Kriging-based approach belongs to the family of metamodel-based methods. It adopts Sobol indices [1] calculated on a Kriging approximation of the simulation model. This method was presented in [21], where a robust Kriging emulator was obtained through the recursive use of the DACE tool [22]. The effectiveness of the method was also demonstrated in [21]: the authors showed that the variance-based sensitivity indices estimated on the Kriging emulator were approximately identical to those derived by the complete variance-based approach described in [1]. However, the Kriging-based SA required only 512 model evaluations, while the variance-based SA took almost 40,000 model evaluations.

These two approaches were used successfully, but individually, in previous studies of complex simulation models [19], [21]. The comparison study [23] found that the quasi-OTEE SA is better at identifying influential and non-influential parameters, while the Kriging-based SA is more accurate in ranking parameters according to their sensitivity indices. To fully exploit their respective strengths, it is reasonable and practical to apply the two approaches sequentially: the quasi-OTEE approach is used first to screen out non-influential parameters, and the Kriging-based approach is applied in the second stage to calculate the variance-based sensitivity indices and rank the most influential parameters.
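The hand-off between the two stages can be sketched as follows: parameters flagged as non-influential by the screening stage are frozen at nominal values, and the second stage only sees the reduced model. The cutoff rule (a fraction of the largest screening measure), the helper names and the numbers are hypothetical choices made for this illustration and are not taken from the paper.

```python
def select_influential(mu_star, threshold_fraction=0.05):
    """Indices whose screening measure is at least a fraction of the largest one.

    The threshold rule is an assumption of this sketch.
    """
    cutoff = threshold_fraction * max(mu_star)
    return [i for i, m in enumerate(mu_star) if m >= cutoff]


def reduce_model(model, nominal, keep):
    """Wrap `model` so that only the kept parameters vary; the rest stay nominal."""
    def reduced(z):
        x = list(nominal)
        for idx, value in zip(keep, z):
            x[idx] = value
        return model(x)
    return reduced


def full_model(x):
    # Hypothetical 4-parameter model in which x[3] barely matters.
    return 5.0 * x[0] + x[1] * x[2] + 0.001 * x[3]


mu_star = [5.0, 0.6, 0.5, 0.001]   # hypothetical screening output
keep = select_influential(mu_star)
reduced = reduce_model(full_model, nominal=[0.5, 0.5, 0.5, 0.5], keep=keep)
print(keep)
print(reduced([1.0, 1.0, 1.0]))
```

The metamodel in the second stage would then be trained on `reduced`, whose input dimension equals the number of retained parameters rather than the dimension of the full model.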

In this paper we perform several numerical experiments on different test functions to evaluate the performance of the proposed approach. The test functions are commonly accepted benchmarks for testing SA methods. The number of parameters in the SA ranges between 12 and 20 depending on the test function, which is generally sufficient to regard a model as high-dimensional. In contrast, the test functions themselves are not computationally expensive. Since the computation time is not necessarily related to the complexity of the model, we assess the efficiency of the SA method in terms of the number of required model evaluations rather than the total computation time. For a given model, the total computation time is in any case roughly proportional to the number of model evaluations.

In the numerical experiments, the sequential SA method is applied to screen and rank the most influential parameters of the chosen test functions, and the results are compared with reference results obtained either from analytical calculations or from a standard variance-based SA. The sequential SA method yields estimates of the variance-based sensitivity indices that are very close to the theoretical values for the most influential parameters. In addition, the proposed approach is shown to be much more efficient than a standard variance-based approach.

The paper is organized as follows. A brief review of the quasi-OTEE and Kriging-based approaches is presented in Section 2. The numerical experiments for the SA are introduced in Section 3, and their results are discussed in Section 4. Conclusions are given in Section 5.

Section snippets

Review of the two SA methods

A brief review of the above-mentioned quasi-OTEE approach [20] and the Kriging-based approach [21] is provided in this section. For more details about the two approaches, interested readers may refer to [23], where a similar but more in-depth review is given. Here, we summarize the main features of each approach for the reader's convenience.

Test functions

In this study we include several numerical experiments to demonstrate and test the proposed sequential SA approach. To this end, we have chosen 4 different test functions that are commonly used as benchmarks in evaluating the SA approaches (e.g., [7], [3], [2], [17], [37]). Moreover, all of them are, loosely speaking, high-dimensional functions with more than 10 parameters. Below is a brief introduction of the test functions used in this study.

Results of the quasi-OTEE SA

The results of the quasi-OTEE SA for all tests are shown in Fig. 1a–e. The details are given below.

Conclusions

In this paper we proposed a new method that sequentially applies the quasi-OTEE and the Kriging-based approaches for the SA of high-dimensional and computationally expensive models. The influential parameters are first screened by the quasi-OTEE approach. Then, based on the screening results, the Kriging-based approach is applied to rank the most influential parameters.

If compared with a standalone screening method that can only provide qualitative information about

Acknowledgment

Research contained within this paper benefited from participation in the EU COST Action TU0903 – Methods and tools for supporting the Use caLibration and validaTIon of Traffic simUlation moDEls (MULTITUDE).

References (40)

F. Campolongo et al. From screening to quantitative sensitivity analysis. A unified approach. Comput Phys Commun (2011)
S. Kucherenko et al. Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab Eng Syst Saf (2009)
I.M. Sobol' et al. Derivative based global sensitivity measures and their link with global sensitivity indices. Math Comput Simul (2009)
I.M. Sobol' et al. A new derivative based importance criterion for groups of variables and its link with the global sensitivity indices. Comput Phys Commun (2010)
F. Campolongo et al. An effective screening design for sensitivity analysis of large models. Environ Model Softw (2007)
A. Saltelli et al. Screening important inputs in models with strong interaction properties. Reliab Eng Syst Saf (2009)
A. Saltelli et al. About the use of rank transformation in sensitivity analysis of model output. Reliab Eng Syst Saf (1995)
T. Homma et al. Importance measures in global sensitivity analysis of nonlinear models. Reliab Eng Syst Saf (1996)
A. Saltelli. Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun (2002)
C.B. Storlie et al. Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models. Reliab Eng Syst Saf (2009)
A. Saltelli et al. Variance based sensitivity analysis of model output: design and estimator for the total sensitivity index. Comput Phys Commun (2010)
M.V. Ruano et al. An improved sampling strategy based on trajectory design for application of the Morris method to systems with many input factors. Environ Model Softw (2012)
I.M. Sobol'. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul (2001)
A. Marrel et al. An efficient methodology for modeling complex computer codes with Gaussian processes. Comput Stat Data Anal (2008)
S. Kucherenko et al. The identification of model effective dimensions using global sensitivity analysis. Reliab Eng Syst Saf (2011)
I.M. Sobol'. Uniformly distributed sequences with additional uniformity properties. USSR Comput Math Math Phys (1976)
A. Saltelli et al. Global sensitivity analysis – the primer (2008)
M.D. Morris. Factorial sampling plans for preliminary computational experiments. Technometrics (1991)
I.M. Sobol'. Sensitivity analysis for nonlinear mathematical models. Math Models Comput Exp (1993)
A. Saltelli et al. A quantitative model-independent method for global sensitivity analysis of model output. Technometrics (1999)

