Incorporating prior knowledge with simulation data to estimate PSF multipliers using Bayesian logistic regression

doi:10.1016/j.ress.2019.04.022

Reliability Engineering & System Safety

Volume 189, September 2019, Pages 210-217

https://doi.org/10.1016/j.ress.2019.04.022

Highlights

• Relations between PSFs and HEPs were modeled by Bayesian regression.
• The results from the frequentist approach and the Bayesian approach were compared.
• Ways to overcome the limitations of frequentist statistical analyses are discussed.

Abstract

Recently, several kinds of databases have been constructed and analyzed to support human reliability analyses. Based on these, some researchers have attempted to model the quantitative relations between performance shaping factors (PSFs) and human error probabilities (HEPs). However, limitations of the traditional regression technique and of the simulation data employed have come to light. To tackle these issues with traditional statistical analysis, this study proposes an analysis based on the Bayesian logistic regression method, which combines empirical data with prior knowledge. The method was applied to four different sets of prior knowledge and to empirical data collected via the Human Reliability data Extraction (HuREX) framework. The means and credible intervals of the resulting posterior distributions were compared with previous research. From the application, we found that the suggested approach is useful for consolidating various data sources to estimate the multipliers of PSFs on error probabilities, for producing results that are robust to the data characteristics, and for quantifying the uncertainty of the estimates. It was also confirmed that selecting appropriate prior knowledge and collecting abundant, accurate empirical data are important for producing meaningful insights into PSF impacts.

Introduction

Human reliability analyses (HRAs) have been widely recognized as an important activity of probabilistic safety assessments (PSAs) in many safety-critical industries. HRAs have been conducted to identify significant mechanisms of human error and/or to predict human error probabilities (HEPs) for significant tasks. However, researchers in the HRA field have recognized that new empirical evidence is required to enhance the quality of HRA results [1], [2]. This is because the empirical data employed are too old to reflect recent trends in human behavior or control room environments, and were not produced by a solid statistical process [2].

In this light, several databases have been developed, such as the Scenario Authoring, Characterization, and Debriefing Application (SACADA) [1] and Operator Performance and Reliability Analysis (OPERA) [3], and some estimates have been reported from the empirical data. For example, Gesellschaft für Anlagen und Reaktorsicherheit (GRS) generated a set of HEPs using a Bayesian inference technique based on plant experience data [4]. Chang et al. also proposed a method for estimating HEPs from simulator records [5].

It is usually more difficult to model the effects of performance shaping factors (PSFs) on human reliability or HEPs using data science techniques than to estimate HEPs from a set of data. This is because data for modeling PSF effects must contain both (1) a varied range of information on PSFs or contexts and (2) reliability information on human operators in those contexts. To estimate the PSF effects, the Korea Atomic Energy Research Institute (KAERI) developed the Human Reliability data Extraction (HuREX) framework for collecting human reliability data, including human performance and contextual information [3]. Kim et al. tried to predict the PSF multipliers, which represent the impacts of the contexts on HEPs, from the collected data [6]. A regression technique based on maximum likelihood estimation was employed to derive the PSF multipliers. It was shown that the HuREX-based data and the employed statistical analysis were useful for modeling PSF effects in the manner of the quantification models of many existing HRA methods.

However, it was also found that the HuREX-based data and the statistical modeling approach have several limitations (discussed in Section 2). To overcome some of the issues with the empirical data and the analysis technique, this paper proposes a Bayesian inference technique that combines expert knowledge with the empirical data to statistically model PSF effects. The proposed approach is applied to the same set of data and compared with the results from the maximum likelihood estimation.

Section 2 describes the weaknesses of the existing approaches to PSF modeling based on empirical data, and Section 3 introduces the Bayesian approach. Section 4 explains the data employed for the analysis as background for the results. In Section 5, the estimated PSF effects are shown and compared with the pre-existing results. In Section 6, we discuss some characteristics of Bayesian approaches, including uncertainty modeling and data sensitivity, based on the results of the PSF modeling.

Section snippets

Previous works for modeling PSF effects

There have been some efforts to identify the relations between HEPs and PSFs. For example, Kirwan et al. generated PSF profiles for HEP information from plant experience data and estimated the effects of each PSF by comparing the HEPs at different PSF levels [7]. Kim et al. also derived PSF multipliers, which are multiplicative effects of PSFs on HEPs, under low-power and shut-down status of NPPs using the PSF profiling method [8]. Liu and Li tried to statistically analyze PSF impacts on HEPs

Bayesian inference for logistic regression

Recently, Bayesian inference has come to be regarded as an alternative to traditional statistical analysis for analyzing and interpreting data in most data science fields, including PSA and HRA [12], [13], [14], [15], [16], [17]. The Bayesian method for data analysis produces a posterior probability based on observed data and prior knowledge using Bayes’ theorem. Bayesian inference is described by Eq. (1): $\pi_{1}(\theta \mid D) = \frac{f(D \mid \theta)\,\pi_{0}(\theta)}{\int f(D \mid \theta)\,\pi_{0}(\theta)\,d\theta},$ where D is observed empirical data and θ is a
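The update in Eq. (1) can be illustrated with a minimal sketch. It assumes a logistic model with a single binary PSF, a fixed intercept, a Normal prior on the PSF coefficient, and a grid approximation of the normalizing integral. The error counts, the prior parameters, and all function names are hypothetical and are not taken from the paper, which may use a different model structure and sampling method.

```python
import math

def logistic(z):
    """Inverse logit: map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, intercept, data):
    """log f(D | theta) for binomial counts; the binomial coefficient is
    dropped because it does not depend on beta and cancels in Eq. (1)."""
    total = 0.0
    for psf_level, errors, trials in data:
        p = logistic(intercept + beta * psf_level)
        total += errors * math.log(p) + (trials - errors) * math.log(1.0 - p)
    return total

def log_normal_prior(beta, mean, sd):
    """log pi_0(theta) for a Normal(mean, sd) prior on the coefficient."""
    return -0.5 * ((beta - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))

def grid_posterior(data, intercept, prior_mean, prior_sd, lo=-5.0, hi=5.0, n=2001):
    """Evaluate pi_1(theta | D) on a regular grid, normalizing numerically."""
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    log_post = [log_likelihood(b, intercept, data)
                + log_normal_prior(b, prior_mean, prior_sd) for b in grid]
    peak = max(log_post)                      # subtract the peak for stability
    unnorm = [math.exp(v - peak) for v in log_post]
    evidence = sum(unnorm) * step             # Riemann sum for the denominator
    return grid, [u / evidence for u in unnorm], step

def summarize(grid, density, step, level=0.95):
    """Posterior mean and equal-tailed credible interval from the grid."""
    mean = sum(b * d for b, d in zip(grid, density)) * step
    tail = (1.0 - level) / 2.0
    cdf, lower, upper = 0.0, None, None
    for b, d in zip(grid, density):
        cdf += d * step
        if lower is None and cdf >= tail:
            lower = b
        if upper is None and cdf >= 1.0 - tail:
            upper = b
    return mean, lower, upper

# Hypothetical data: (PSF level, observed errors, task opportunities).
data = [(0, 2, 400), (1, 9, 300)]
intercept = math.log(0.005 / 0.995)           # assumed nominal HEP of 0.005
grid, density, step = grid_posterior(data, intercept,
                                     prior_mean=math.log(5.0), prior_sd=1.0)
post_mean, ci_low, ci_high = summarize(grid, density, step)
print(f"posterior mean = {post_mean:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

A grid is only practical for one or two parameters. With several PSFs, a sampling method such as Markov chain Monte Carlo would normally replace the explicit integral.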

Empirical data for likelihood function

Human reliability data from the OPERA database were used for the likelihood function in the Bayesian inference [3]. The data were extracted from the full-scope simulator training records, which included human responses and related situational factors to cope with two emergency situations and 12 abnormal situations [6]:

- Interfacing System Loss of Coolant Accident (ISLOCA)
- Steam Generator Tube Rupture (SGTR), following Main Steam Line Break (MSLB)
- Control element assembly deviation
- Charging system

PSF multipliers

Table 5 compares the PSF multipliers quantified by maximum likelihood estimation, two previous HRA studies, and four cases of Bayesian inference. To derive PSF multipliers through Bayesian inference, the regression coefficients from the Bayesian updates were exponentiated. In the case of positive PSF variables, we took the reciprocal of the exponentiated coefficient. The nominal HEPs indicate the predicted values in the regression models, when all explanatory variables have positive
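The conversion described above can be sketched as follows. The coefficient values and variable names are hypothetical placeholders, not results from Table 5. Strictly speaking, an exponentiated logistic regression coefficient is an odds ratio; it approximates a ratio of probabilities when the HEPs are small.

```python
import math

def psf_multiplier(coefficient, positive_psf=False):
    """Exponentiate a logistic regression coefficient; for a positive PSF
    variable, return the reciprocal of the exponentiated coefficient."""
    ratio = math.exp(coefficient)
    return 1.0 / ratio if positive_psf else ratio

def multiplier_interval(coef_low, coef_high, positive_psf=False):
    """Map credible-interval endpoints of a coefficient to the multiplier
    scale. The reciprocal reverses the order, so the pair is re-sorted."""
    ends = sorted(psf_multiplier(c, positive_psf) for c in (coef_low, coef_high))
    return ends[0], ends[1]

# Hypothetical posterior means: name -> (coefficient, is_positive_psf).
posterior_means = {
    "hypothetical_negative_psf": (1.10, False),
    "hypothetical_positive_psf": (-0.69, True),
}
multipliers = {name: psf_multiplier(coef, positive)
               for name, (coef, positive) in posterior_means.items()}
for name, value in multipliers.items():
    print(f"{name}: multiplier = {value:.2f}")
```

Because exponentiation is monotone, interval endpoints can be transformed directly, whereas the posterior mean of a multiplier generally differs from the exponentiated posterior mean of its coefficient.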

Discussion and conclusion

In this research, we addressed the limitations of maximum likelihood estimation and simulation data for modeling PSF effects on human reliability and proposed a Bayesian logistic regression technique to tackle these limitations. The suggested approach allows for consolidating various data sources to estimate PSF multipliers, thus producing results robust to the data characteristics and assessing the quantitative uncertainties of the estimation. As many articles [15], [16], [17] have indicated,

Acknowledgment

This work was supported by the Nuclear Research & Development Program of the National Research Foundation of Korea grant, funded by the Korean government, Ministry of Science, ICT & Future Planning (grant number 2017M2A8A4015291).

References (30)

Y. Kim et al. A statistical approach to estimating effects of performance shaping factors on human error probabilities of soft controls. Reliab Eng Syst Saf (2015)
W. Preischl et al. Human error probabilities from operational experience of German nuclear power plants. Reliab Eng Syst Saf (2013)
Y. Kim et al. Estimating the quantitative relation between PSFs and HEPs from full-scope simulator data. Reliab Eng Syst Saf (2018)
A.R. Kim et al. Quantification of performance shaping factors (PSFs)’ weightings for human reliability analysis (HRA) of low power and shutdown (LPSD) operations. Ann Nucl Energy (2017)
J. Park et al. An experimental investigation on relationship between PSFs and operator performances in the digital main control room. Ann Nucl Energy (2017)
K. Groth et al. A Bayesian method for using simulator data to enhance human error probabilities assigned by existing HRA methods. Reliab Eng Syst Saf (2014)
L. Mkrtchyan et al. Bayesian belief networks for human reliability analysis: a review of applications and gaps. Reliab Eng Syst Saf (2015)
Y.J. Chang et al. The SACADA database for human reliability and human performance. Reliab Eng Syst Saf (2014)
W. Jung et al. HuREX-A framework of HRA data collection from simulators in nuclear power plants. Reliab Eng Syst Saf (2018)
Y.J. Chang et al. Example use of the SACADA data to inform HRA.
B. Kirwan et al. Core-data: a computerized human error database for human reliability support.
P. Liu et al. Human error data collection and comparison with predictions by SPAR-H. Risk Anal (2014)
D.H. Ham et al. Use of a big data analysis technique for extracting HRA data from event investigation reports based on the Safety-II concept. Reliab Eng Syst Saf (2018)
J.K. Kruschke et al. The time has come: Bayesian methods for data analysis in the organizational sciences. Organ Res Methods (2012)
M. Kay et al. Researcher-centered design of statistics: why Bayesian statistics better fit the culture and incentives of HCI.

