Incorporating prior knowledge with simulation data to estimate PSF multipliers using Bayesian logistic regression

https://doi.org/10.1016/j.ress.2019.04.022

Highlights

  • Relations between PSFs and HEPs were modeled by Bayesian regression.

  • The results from the frequentist approach and the Bayesian approach were compared.

  • Ways to overcome the limitations of frequentist statistical analyses are discussed.

Abstract

Recently, several kinds of databases have been constructed and analyzed to support human reliability analyses. Based on these, some researchers have attempted to model the quantitative relations between performance shaping factors (PSFs) and human error probabilities (HEPs). However, limitations of the traditional regression technique and of the simulation data employed have come to light. To tackle these issues with traditional statistical analysis, this study proposes an analysis based on the Bayesian logistic regression method that incorporates empirical data with prior knowledge. This method was applied to four different prior knowledge sets and empirical data collected via the Human Reliability data Extraction (HuREX) framework. The means and credible intervals of the obtained posterior distributions were compared with previous research. From the application, we found that the suggested approach is useful in consolidating various data sources to estimate the multipliers of performance shaping factors on error probabilities, producing results robust to the data characteristics, and providing the quantitative uncertainties of the estimation. It is also confirmed that selecting appropriate prior knowledge and collecting abundant and correct empirical data are important for producing meaningful insights into PSF impacts.

Introduction

Human reliability analyses (HRAs) have been widely recognized as an important activity of probabilistic safety assessments (PSAs) in many safety-critical industries. HRAs have been conducted to identify significant mechanisms of human error and/or to predict human error probabilities (HEPs) for significant tasks. However, researchers in the HRA field have recognized that new empirical evidence is required to enhance the quality of HRA results [1], [2]. This is because the empirical data employed are too old to reflect recent trends in human behavior or control room environments, and were not produced by a solid statistical process [2].

In this light, several databases have been developed, such as the Scenario Authoring, Characterization, and Debriefing Application (SACADA) [1] and Operator Performance and Reliability Analysis (OPERA) [3], and some estimates have been reported from the empirical data. For example, Gesellschaft für Anlagen und Reaktorsicherheit (GRS) generated a set of HEPs using a Bayesian inference technique based on plant experience data [4]. Chang et al. also proposed an HEP estimation method based on simulator records [5].

It is usually more difficult to model the PSF effects on human reliabilities or HEPs using data science techniques than to estimate HEPs from a set of data. This is because PSF effects modeling data must contain both (1) a varied range of information on PSFs or contexts for estimating PSF effects and (2) reliability information of human operators in the context. To estimate the PSF effects, the Korea Atomic Energy Research Institute (KAERI) developed the Human Reliability data Extraction (HuREX) framework for collecting human reliability data, including human performance and contextual information [3]. Kim et al. tried to predict the PSF multipliers, which represent impacts of the contexts on HEPs, from the collected data [6]. A regression technique based on maximum likelihood estimation was employed to derive the PSF multipliers. It was shown that the HuREX-based data and the employed statistical analysis were useful for modeling PSF effects in the manner of quantification models of many existing HRA methods.
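To make the idea of a regression-derived PSF multiplier concrete, the following minimal sketch fits a logistic model with a single binary PSF by maximum likelihood. With one binary predictor the fit has a closed form, because the model reproduces the two observed error proportions exactly. The function name and the error counts are hypothetical and are not taken from [6] or from the HuREX data; the analysis in [6] involves more variables than this illustration.

```python
import math

def mle_single_binary_psf(errors_nominal, trials_nominal,
                          errors_degraded, trials_degraded):
    """Closed-form MLE for logit(HEP) = b0 + b1 * x with one binary PSF x.

    x = 0 is the nominal PSF level and x = 1 the degraded level.
    """
    odds_nominal = errors_nominal / (trials_nominal - errors_nominal)
    odds_degraded = errors_degraded / (trials_degraded - errors_degraded)
    b0 = math.log(odds_nominal)                  # intercept: log-odds at nominal level
    b1 = math.log(odds_degraded / odds_nominal)  # slope: log odds ratio
    return b0, b1

# Hypothetical counts (assumption): 2 errors in 400 nominal opportunities,
# 12 errors in 300 degraded opportunities.
b0, b1 = mle_single_binary_psf(2, 400, 12, 300)
print("nominal HEP:", 1.0 / (1.0 + math.exp(-b0)))
print("PSF multiplier on the odds:", math.exp(b1))
```

Note that the estimate is undefined when either group contains zero errors, since the corresponding odds are zero; sparse error counts are one situation in which a purely data-driven estimate of this kind becomes unstable.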

However, it was also found that the HuREX-based data and the statistical modeling approach have several limitations (Section 2 discusses these issues). To overcome some of the issues with the empirical data and the analysis technique, this paper proposes a Bayesian inference technique that incorporates expert knowledge with the empirical data to statistically model PSF effects. The proposed approach is applied to the same set of data, and its results are compared with those from the maximum likelihood estimation.

Section 2 describes the weaknesses of the existing approaches for PSF modeling based on empirical data, and Section 3 introduces the Bayesian approach. Section 4 explains the data employed for the analysis as background for interpreting the results. In Section 5, the estimated PSF effects are shown and compared with the pre-existing results. In Section 6, we discuss some characteristics of Bayesian approaches, including uncertainty modeling and data sensitivity, based on the results of the PSF modeling.

Section snippets

Previous works for modeling PSF effects

There have been some efforts to identify the relations between HEPs and PSFs. For example, Kirwan et al. generated PSF profiles for HEP information from plant experience data and estimated the effects of each PSF by comparing the HEPs at different PSF levels [7]. Kim et al. also derived PSF multipliers, which are multiplicative effects of PSFs on HEPs, under low-power and shut-down status of NPPs using the PSF profiling method [8]. Liu and Li tried to statistically analyze PSF impacts on HEPs

Bayesian inference for logistic regression

Recently, Bayesian inference has come to be regarded as an alternative approach to traditional statistical analysis to analyze and interpret data in most data science fields, including PSA and HRA [12], [13], [14], [15], [16], [17]. The Bayesian method for data analysis produces a posterior probability based on observed data and prior knowledge using Bayes’ theorem. Bayesian inference is described by Eq. (1):

$$\pi_1(\theta \mid D) = \frac{f(D \mid \theta)\,\pi_0(\theta)}{\int f(D \mid \theta)\,\pi_0(\theta)\,d\theta}, \tag{1}$$

where D is observed empirical data and θ is a
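As an illustration of how Eq. (1) can be evaluated for a logistic regression, the sketch below approximates the posterior on a grid for a model with an intercept and one binary PSF. It is only a sketch under stated assumptions: the independent normal priors, the grid ranges, the function names, and the error counts are all hypothetical, and grid approximation is a stand-in technique chosen for transparency, not necessarily the computation used in this study.

```python
import math

def log_likelihood(beta0, beta1, data):
    """Binomial log-likelihood (up to a constant) of a one-predictor logistic model."""
    total = 0.0
    for x, errors, trials in data:
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        total += errors * math.log(p) + (trials - errors) * math.log(1.0 - p)
    return total

def log_normal_prior(value, mean, sd):
    """Log density of a normal prior, up to an additive constant."""
    return -0.5 * ((value - mean) / sd) ** 2

def grid_posterior(data, prior_mean, prior_sd, grid0, grid1):
    """Return normalized posterior weights over a (beta0, beta1) grid."""
    log_post = {}
    for b0 in grid0:
        for b1 in grid1:
            log_post[(b0, b1)] = (
                log_likelihood(b0, b1, data)
                + log_normal_prior(b0, prior_mean[0], prior_sd[0])
                + log_normal_prior(b1, prior_mean[1], prior_sd[1])
            )
    peak = max(log_post.values())  # subtract the peak for numerical stability
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    norm = sum(weights.values())   # discrete stand-in for the integral in Eq. (1)
    return {k: w / norm for k, w in weights.items()}

# Hypothetical data (assumption): (PSF level, errors, opportunities).
data = [(0, 2, 400), (1, 12, 300)]
grid0 = [-8.0 + 0.05 * i for i in range(121)]  # intercept grid, -8 to -2
grid1 = [-2.0 + 0.05 * i for i in range(121)]  # PSF coefficient grid, -2 to 4
# Hypothetical prior (assumption): the PSF doubles the odds, with wide uncertainty.
posterior = grid_posterior(data, (-5.0, math.log(2.0)), (2.0, 1.0), grid0, grid1)
mean_b1 = sum(b1 * w for (_, b1), w in posterior.items())
print("posterior mean of the PSF coefficient:", mean_b1)
```

In this toy setting the posterior mean of the coefficient falls between the prior mean and the value the data alone would suggest, which is the consolidating behavior that motivates the Bayesian approach. For models with more than a few coefficients a grid is impractical, and sampling methods such as Markov chain Monte Carlo are the usual choice.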

Empirical data for likelihood function

Human reliability data from the OPERA database were used for the likelihood function in the Bayesian inference [3]. The data were extracted from the full-scope simulator training records, which included human responses and related situational factors to cope with two emergency situations and 12 abnormal situations [6]:

  • Interfacing System Loss of Coolant Accident (ISLOCA)

  • Steam Generator Tube Rupture (SGTR), following Main Steam Line Break (MSLB)

  • Control element assembly deviation

  • Charging system

PSF multipliers

Table 5 compares the PSF multipliers quantified by maximum likelihood estimation, two previous works in the HRA literature, and four cases of Bayesian inference. To derive PSF multipliers through Bayesian inference, the regression coefficients from the Bayesian updates were exponentiated. In the case of positive PSF variables, we took the reciprocal of the exponentiated coefficient. The nominal HEPs indicate the predicted values in the regression models, when all explanatory variables have positive
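The conversion from coefficients to multipliers described above can be sketched as follows. The function names and the numerical inputs are hypothetical and are not values from Table 5. One caveat worth keeping in mind: the exponentiated coefficient of a logistic regression is an odds ratio, which approximates a multiplier on the HEP itself only when the HEPs involved are small.

```python
import math

def psf_multiplier(coefficient, positive_psf=False):
    """Turn one logistic-regression coefficient into a PSF multiplier."""
    ratio = math.exp(coefficient)
    # For a PSF coded as positive, take the reciprocal, as the text describes.
    return 1.0 / ratio if positive_psf else ratio

def multiplier_interval(coef_lower, coef_upper, positive_psf=False):
    """Map quantile-based credible bounds on a coefficient to the multiplier scale."""
    a = psf_multiplier(coef_lower, positive_psf)
    b = psf_multiplier(coef_upper, positive_psf)
    # The reciprocal reverses the ordering, so sort the two endpoints.
    return (min(a, b), max(a, b))

# Hypothetical posterior summaries (assumption), not results from the paper.
print(psf_multiplier(math.log(3.0)))                      # negative PSF
print(psf_multiplier(-math.log(3.0), positive_psf=True))  # positive PSF
print(multiplier_interval(0.2, 1.4))
```

Quantile-based interval bounds can be transformed directly because the exponential and the reciprocal are monotonic. The posterior mean cannot: the mean of the multiplier should be computed by transforming each posterior sample and then averaging, since exponentiating the mean coefficient gives a different number.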

Discussion and conclusion

In this research, we addressed the limitations of maximum likelihood estimation and simulation data for modeling PSF effects on human reliability and proposed a Bayesian logistic regression technique to tackle these limitations. The suggested approach allows for consolidating various data sources to estimate PSF multipliers, thus producing results robust to the data characteristics and assessing the quantitative uncertainties of the estimation. As many articles [15], [16], [17] have indicated,

Acknowledgment

This work was supported by the Nuclear Research & Development Program of the National Research Foundation of Korea grant, funded by the Korean government, Ministry of Science, ICT & Future Planning (grant number 2017M2A8A4015291).

References (30)

  • B. Kirwan et al. Core-data: a computerized human error database for human reliability support

  • P. Liu et al. Human error data collection and comparison with predictions by SPAR-H. Risk Anal (2014)

  • D.H. Ham et al. Use of a big data analysis technique for extracting HRA data from event investigation reports based on the Safety-II concept. Reliab Eng Syst Saf (2018)

  • J.K. Kruschke et al. The time has come: Bayesian methods for data analysis in the organizational sciences. Organ Res Methods (2012)

  • M. Kay et al. Researcher-centered design of statistics: why Bayesian statistics better fit the culture and incentives of HCI
