Combining screening and metamodel-based methods: An efficient sequential approach for the sensitivity analysis of model outputs
Introduction
Simulation models are now widely used across scientific disciplines for purposes such as system design, evaluation, and optimization. The reliability of the simulation results depends on the quality of the calibration. Calibration is therefore essential, yet it is usually complicated when the model contains hundreds or even thousands of parameters (i.e., is high-dimensional), and/or when running the model is computationally expensive.
Given constraints on computation time and other resources, one feasible way to reduce the complexity of calibrating a high-dimensional, computationally expensive model is to calibrate only the most influential input parameters, i.e., the parameters whose variations are expected to have significant impacts on the model output. In this way, the model outputs can be adjusted efficiently towards the correct values by fine-tuning the influential parameters. The proper approach to distinguishing influential from non-influential parameters is sensitivity analysis (SA).
SA explores the relationship between model outputs and input parameters [1]. A proper SA could provide qualitative and/or quantitative information regarding the effects of different model parameters (and their variations) on the model outputs. Such information can be used to eliminate the least relevant parameters in the subsequent calibration, and help practitioners to better understand both the model and its parameters, especially when the model is high-dimensional or behaves like a “black-box”.
Due to its importance, SA has been extensively developed in the last decades [1]. Some of the widely known SA methods are briefly described below.
The derivative-based method uses the one-at-a-time (OAT) design, i.e., varying one parameter at a time while keeping all other parameters fixed at a nominal value. The sensitivity measure of the varied parameter is estimated by computing the corresponding partial derivative of the model response. This method requires only a few model evaluations to estimate the derivatives. However, it cannot detect interaction effects among parameters [2], and the derivatives are informative only at the fixed nominal points in the input space [3]. Some studies, such as [2], [3], [4] and [5], have proposed approaches in which multi-dimensional averaging of the derivatives is used to explore the interaction effects. In this way, global sensitivity measures can be obtained.
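The OAT idea can be illustrated with a minimal sketch. The function names, the forward-difference step and the toy model below are assumptions made for illustration only and do not come from the cited studies. The sketch estimates each partial derivative at a nominal point using one extra model evaluation per parameter.

```python
def oat_derivatives(model, nominal, step=1e-6):
    """Forward-difference partial derivatives at `nominal`, one parameter at a time.

    Costs len(nominal) + 1 model evaluations.
    """
    base = model(nominal)
    derivatives = []
    for i in range(len(nominal)):
        perturbed = list(nominal)
        perturbed[i] += step          # vary parameter i only
        derivatives.append((model(perturbed) - base) / step)
    return derivatives


def toy_model(x):
    # Hypothetical model: x[0] and x[1] act only through their product.
    return x[0] * x[1] + 2.0 * x[2]


at_origin = oat_derivatives(toy_model, [0.0, 0.0, 0.0])
at_ones = oat_derivatives(toy_model, [1.0, 1.0, 1.0])
```

At the origin the derivatives with respect to the first two parameters vanish, although both parameters matter elsewhere in the input space. This is the local character of the derivative-based measure noted above.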
The screening method is typically employed to identify the non-influential parameters of a model. Examples of this kind of method can be found in [2], [6], [7], and [8]. It usually incurs a relatively low computational cost, which makes it attractive, especially for complex models. It can also be used to reduce the number of parameters to be considered before applying a more complicated method such as the variance-based method. According to Kucherenko et al. [3], one drawback of the screening method is that it cannot provide straightforward information on the total effects (for details see the review in Section 2.1), and it ranks the parameters less accurately than the variance-based method.
The variance-based method decomposes the total variance of the model outputs into the conditional variance of each individual parameter, and uses this measure to represent the importance of the parameter (for details see the review in Section 2.2). The development of the variance-based method can be found in [9], [10], [11], [12], [13]. The variance-based method is one of the best methods available today for computing the sensitivity indices purely from model evaluations [1]. However, a good estimate of these quantitative sensitivity indices usually requires a large number of model evaluations. Although an improved sampling method was developed in [14] to enhance its efficiency, the high computational demand still makes this method less practical for large computational models [2], [3].
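To make the cost of the variance-based indices concrete, the following is a minimal Monte Carlo sketch, assuming independent inputs uniform on [0, 1]. The function names, the sample size and the additive test function are illustrative assumptions. The two estimators are commonly used forms from the sensitivity analysis literature and are not necessarily the ones used in the works cited above.

```python
import numpy as np


def sobol_indices(model, dim, n_base, seed=0):
    """Estimate first-order and total indices for inputs uniform on [0, 1]^dim.

    Uses n_base * (dim + 2) model evaluations.
    """
    rng = np.random.default_rng(seed)
    a = rng.random((n_base, dim))
    b = rng.random((n_base, dim))
    f_a, f_b = model(a), model(b)
    variance = np.var(np.concatenate([f_a, f_b]))
    first = np.empty(dim)
    total = np.empty(dim)
    for i in range(dim):
        ab = a.copy()
        ab[:, i] = b[:, i]            # matrix A with column i taken from B
        f_ab = model(ab)
        # First-order effect: covariance-type estimator.
        first[i] = np.mean(f_b * (f_ab - f_a)) / variance
        # Total effect: Jansen-type estimator.
        total[i] = 0.5 * np.mean((f_a - f_ab) ** 2) / variance
    return first, total


def linear_model(x):
    # Hypothetical additive test function; the exact indices are 0.2 and 0.8.
    return x[:, 0] + 2.0 * x[:, 1]


first, total = sobol_indices(linear_model, dim=2, n_base=200000)
```

The evaluation count grows as n_base * (dim + 2), which is why the approach becomes impractical when each run of the original model is expensive.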
The metamodel-based method relies on a metamodel, which is an abstraction of the original model. When the original model behaves like a black-box, and/or when it is very costly to run, the metamodel can be used to approximate the original model (more details are given in Section 2.2). Since the metamodel itself is usually computationally cheap, the variance-based sensitivity indices can be estimated efficiently from the metamodel rather than from the original model. Examples of applying metamodels to estimate the total sensitivity indices can be found in [15], [16]. Most of the effort is spent on developing the metamodel (e.g., mapping all possible interactions [17]) and on calibrating it. As this effort generally depends on the number of parameters contained in the model [17], the computational cost can still be huge when the original model contains many parameters. In addition, when the model is high-dimensional and the interactions among the parameters are not negligible, it is also difficult to obtain an accurate metamodel.
The choice of a specific SA method depends strongly on the model to be analyzed and on the goal of the analysis [18]. Therefore, there is no universal SA method that fits all possible needs. For complex models, the chosen SA method must be both accurate and efficient. However, none of the above methods seems to satisfy this requirement fully when used alone: the derivative-based and screening-based methods lack accuracy in estimating the total effects or the interaction effects, while the variance-based and metamodel-based methods are usually computationally expensive when used on complex models. To achieve a correct and feasible SA for complex models, specifically high-dimensional and computationally expensive models, in this paper we propose a novel method that combines two recently developed global SA approaches, namely the quasi-optimized trajectory based elementary effects (quasi-OTEE) approach and the Kriging-based approach.
The quasi-OTEE approach belongs to the category of screening methods. It was introduced in [19] and [20] based on the elementary effects (EE) method [6], but with much higher efficiency. The two validation experiments and the case study provided in [19] and [20] demonstrated that, with a small number of model evaluations, this tool can properly identify the non-influential parameters of a computationally expensive model to which other quantitative SA techniques cannot feasibly be applied at the outset. For example, it was shown in [20] that the quasi-OTEE approach yielded results similar to those obtained with the OTEE method in [7], but required only a small fraction of its computation time.
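For orientation, the sketch below shows the plain EE screening on which quasi-OTEE builds. It draws random trajectories on the unit hypercube and does not implement the quasi-optimized trajectory selection of [19] and [20]. The function name, the number of levels and the test function are assumptions made for illustration.

```python
import random


def elementary_effects(model, dim, n_traj=10, levels=4, seed=0):
    """Return (mu_star, sigma) of the elementary effects on [0, 1]^dim.

    Uses n_traj * (dim + 1) model evaluations.
    """
    rng = random.Random(seed)
    delta = levels / (2.0 * (levels - 1))
    # Base levels chosen so that adding delta keeps each coordinate in [0, 1].
    base_levels = [j / (levels - 1) for j in range(levels // 2)]
    effects = [[] for _ in range(dim)]
    for _ in range(n_traj):
        x = [rng.choice(base_levels) for _ in range(dim)]
        y = model(x)
        order = list(range(dim))
        rng.shuffle(order)
        for i in order:               # move one parameter per step
            x[i] += delta
            y_new = model(x)
            effects[i].append((y_new - y) / delta)
            y = y_new
    mu_star = [sum(abs(e) for e in ee) / n_traj for ee in effects]
    means = [sum(ee) / n_traj for ee in effects]
    sigma = [(sum((e - m) ** 2 for e in ee) / (n_traj - 1)) ** 0.5
             for ee, m in zip(effects, means)]
    return mu_star, sigma


def toy_model(x):
    # Hypothetical test function: x[3] has no influence on the output.
    return 3.0 * x[0] + x[1] * x[2]


mu_star, sigma = elementary_effects(toy_model, dim=4)
```

A parameter with a small mean absolute effect (mu_star) is a candidate for being non-influential, while a large spread (sigma) points to nonlinearity or interaction.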
The Kriging-based approach belongs to the family of metamodel-based methods. It adopts Sobol indices [1] calculated on a Kriging approximation of the simulation model. This method was presented in [21], where a robust Kriging emulator was obtained through the recursive use of the DACE tool [22]. The effectiveness of the method was also demonstrated in [21]. The authors showed that the variance-based sensitivity indices estimated from the Kriging emulator were nearly identical to those derived by the complete variance-based approach described in [1]. However, the Kriging-based SA required only 512 model evaluations, while the variance-based SA took almost 40,000 model evaluations.
These two approaches were used successfully, but individually, in previous studies of complex simulation models [19], [21]. The comparison study [23] found that the quasi-OTEE SA is better at identifying influential and non-influential parameters, while the Kriging-based SA is more accurate in ranking parameters according to their sensitivity indices. To fully exploit their respective strengths, it is reasonable and practical to apply the two approaches sequentially: the quasi-OTEE is used first to screen out non-influential parameters, and the Kriging-based approach is applied in the second stage to calculate the variance-based sensitivity indices and rank the most influential parameters.
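The two-stage idea can be sketched as follows. This is only an illustration under strong simplifying assumptions: the screening stage uses plain random EE trajectories rather than quasi-OTEE, the emulator is a simple Kriging interpolator with a fixed squared-exponential kernel rather than the DACE-based emulator of [21], and the screening threshold, sample sizes and test function are hypothetical.

```python
import numpy as np


def screen(model, dim, n_traj, rng, levels=4):
    """Stage 1: mean absolute elementary effect (mu*) per parameter."""
    delta = levels / (2.0 * (levels - 1))
    ee = np.zeros((n_traj, dim))
    for t in range(n_traj):
        x = rng.integers(0, levels // 2, size=dim) / (levels - 1)
        y = model(x)
        for i in rng.permutation(dim):
            x[i] += delta
            y_new = model(x)
            ee[t, i] = (y_new - y) / delta
            y = y_new
    return np.abs(ee).mean(axis=0)


def fit_kriging(x_train, y_train, length_scale=0.5, nugget=1e-6):
    """Stage 2a: Kriging predictor with a fixed kernel (assumed, not tuned)."""
    def kernel(p, q):
        d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-0.5 * d2 / length_scale ** 2)

    mean = y_train.mean()
    k = kernel(x_train, x_train) + nugget * np.eye(len(x_train))
    weights = np.linalg.solve(k, y_train - mean)
    return lambda x_new: mean + kernel(x_new, x_train) @ weights


def sequential_sa(model, dim, n_traj=10, threshold=0.1,
                  n_train=128, n_mc=4000, seed=0):
    rng = np.random.default_rng(seed)
    mu_star = screen(model, dim, n_traj, rng)
    # Keep parameters whose mu* exceeds a fraction of the largest mu*.
    keep = np.flatnonzero(mu_star > threshold * mu_star.max())

    def embed(z):
        # Screened-out parameters are frozen at the mid-point 0.5.
        x = np.full((len(z), dim), 0.5)
        x[:, keep] = z
        return x

    z_train = rng.random((n_train, len(keep)))
    y_train = np.array([model(x) for x in embed(z_train)])
    emulator = fit_kriging(z_train, y_train)

    # Stage 2b: total indices estimated on the cheap emulator only.
    a = rng.random((n_mc, len(keep)))
    b = rng.random((n_mc, len(keep)))
    f_a = emulator(a)
    variance = f_a.var()
    total = np.empty(len(keep))
    for j in range(len(keep)):
        ab = a.copy()
        ab[:, j] = b[:, j]
        total[j] = 0.5 * np.mean((f_a - emulator(ab)) ** 2) / variance
    return keep, total


def toy_model(x):
    # Hypothetical 6-parameter function dominated by x[0] and x[1].
    return 4.0 * x[0] + 2.0 * x[1] + x[0] * x[1] + 0.01 * np.sum(x[2:])


keep, total = sequential_sa(toy_model, dim=6)
```

With these defaults the sketch spends n_traj * (dim + 1) + n_train = 198 evaluations of the original model. The Monte Carlo sampling for the indices runs on the emulator only.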
In this paper we perform several numerical experiments on different test functions to evaluate the performance of the proposed approach. The test functions employed in the numerical experiments are commonly accepted benchmark functions for testing SA methods. The number of parameters in the SA ranges between 12 and 20 depending on the test function, which is generally sufficient to define a model as high-dimensional. By contrast, the test functions themselves are not strictly computationally expensive. Since the computation time is not necessarily related to the complexity of the model, we argue that the efficiency of the SA method should be assessed in terms of the number of required model evaluations rather than the total computation time. In any case, for a given model the total computation time is proportional to the number of model evaluations.
In the numerical experiments, the sequential SA method is applied to screen and rank the most influential parameters in the chosen test functions, and the results are compared with the true results obtained either from analytical calculations or from a standard variance-based SA. It is found that, for the most influential parameters, the sequential SA method produces estimates of the variance-based sensitivity indices that are very close to the theoretical values. In addition, the proposed approach is shown to be much more efficient than a standard variance-based approach.
The paper is organized as follows. A brief review of the quasi-OTEE and Kriging-based approaches is presented in Section 2. Following the review, the details of the numerical experiments for the SA are introduced in Section 3, and the results of the experiments are discussed in Section 4. Conclusions are given in Section 5.
Review of the two SA methods
A brief review of the above-mentioned quasi-OTEE approach [20] and the Kriging-based approach [21] is provided in this section. For more details about the two approaches, interested readers may refer to [23], where a similar but more in-depth review is given. Here, we summarize the main features of each approach again for the reader's convenience.
Test functions
In this study we include several numerical experiments to demonstrate and test the proposed sequential SA approach. To this end, we have chosen four different test functions that are commonly used as benchmarks for evaluating SA approaches (e.g., [7], [3], [2], [17], [37]). Moreover, all of them are, loosely speaking, high-dimensional functions with more than 10 parameters. Below is a brief introduction to the test functions used in this study.
Results of the quasi-OTEE SA
The results of the quasi-OTEE SA for all tests are shown in Fig. 1a–e. The details are given below.
Conclusions
In this paper we proposed a new method that sequentially applies the quasi-OTEE and the Kriging-based approaches for the SA of high-dimensional and computationally expensive models. The influential parameters are first screened by the quasi-OTEE approach. Then, based on the screening results, the Kriging-based approach is applied to rank the most influential parameters.
If compared with a standalone screening method that can only provide qualitative information about
Acknowledgment
Research contained within this paper benefited from participation in the EU COST Action TU0903 – Methods and tools for supporting the Use caLibration and validaTIon of Traffic simUlation moDEls (MULTITUDE).
References (40)
- et al. From screening to quantitative sensitivity analysis. A unified approach. Comput Phys Commun (2011)
- et al. Monte Carlo evaluation of derivative-based global sensitivity measures. Reliab Eng Syst Saf (2009)
- et al. Derivative based global sensitivity measures and their link with global sensitivity indices. Math Comput Simul (2009)
- et al. A new derivative based importance criterion for groups of variables and its link with the global sensitivity indices. Comput Phys Commun (2010)
- et al. An effective screening design for sensitivity analysis of large models. Environ Model Softw (2007)
- et al. Screening important inputs in models with strong interaction properties. Reliab Eng Syst Saf (2009)
- et al. About the use of rank transformation in sensitivity analysis of model output. Reliab Eng Syst Saf (1995)
- et al. Importance measures in global sensitivity analysis of nonlinear models. Reliab Eng Syst Saf (1996)
- Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun (2002)
- et al. Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models. Reliab Eng Syst Saf (2009)
- Variance based sensitivity analysis of model output: design and estimator for the total sensitivity index. Comput Phys Commun
- An improved sampling strategy based on trajectory design for application of the Morris method to systems with many input factors. Environ Model Softw
- Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul
- An efficient methodology for modeling complex computer codes with Gaussian processes. Comput Stat Data Anal
- The identification of model effective dimensions using global sensitivity analysis. Reliab Eng Syst Saf
- Uniformly distributed sequences with additional uniformity properties. USSR Comput Math Math Phys
- Global sensitivity analysis – the primer
- Factorial sampling plans for preliminary computational experiments. Technometrics
- Sensitivity analysis for nonlinear mathematical models. Math Models Comput Exp
- A quantitative model-independent method for global sensitivity analysis of model output. Technometrics