Bayesian network model for predicting probability of third-party damage to underground pipelines and learning model parameters from incomplete datasets

https://doi.org/10.1016/j.ress.2020.107262

Highlights

  • Develop a BN model to evaluate probability of third-party damage to pipelines.

  • Apply expectation-maximization algorithm to learn parameters of the BN model.

  • Demonstrate the effectiveness of parameter learning using real-world TPD datasets.

  • Provide a data-driven means to improve the pipeline integrity management for TPD.

Abstract

Damage caused by third-party excavation is one of the leading threats to the structural integrity of underground energy pipelines. Based on fault tree models reported in the literature, the present study develops a Bayesian network (BN) model to estimate the probability of a given pipeline being hit by third-party excavations, taking into account common protective and preventative measures. The Expectation-Maximization (EM) algorithm is employed to learn the parameters of the BN model from datasets that consist of individual cases of third-party activities with missing information. The effectiveness of the parameter learning for the developed BN is demonstrated by a numerical example involving simulated datasets of third-party activities and a case study using real-world datasets obtained from a major pipeline operator in Canada. The BN model and EM-based parameter learning proposed in this study allow pipeline operators to estimate the probability of hit by taking into account historical third-party excavation records in an objective, efficient manner.

Introduction

Historical pipeline incident data indicate that mechanical damage from excavations by third parties is one of the leading threats to the structural integrity of buried pipelines [9,19]. A third party is neither a pipeline operator nor a contractor hired by the operator to service the pipeline; in other words, a third party is an individual or organization unrelated to pipeline assets. About 26% of the pipe-related incidents on onshore gas transmission pipelines in the United States between 2002 and 2013 resulted from third-party excavations, almost equal to the number of incidents caused by external and internal corrosion combined [19]. Third-party damage is also the leading threat to gas transmission pipelines in Europe, accounting for 28.4% of all gas pipeline incidents between 1970 and 2016 (EGIG, [9]). Therefore, the pipeline industry and regulatory agencies are devoting significant efforts to preventing pipelines from being damaged by third-party excavations. Commonly used preventative measures for third-party damage (TPD) include, for example, the one-call system (third parties notify the pipeline operators through one-call centers before excavations), warning signs along the pipeline right-of-way (ROW), regular patrol of the ROW, and supervision of excavations by personnel of pipeline operators. Protective measures for TPD include the burial depth of pipelines and physical protection such as concrete slabs buried above the pipeline alignment. The Pipeline and Hazardous Materials Safety Administration (PHMSA) of the US Department of Transportation and the Common Ground Alliance (CGA) have been using the damage information reporting tool (DIRT) to collect data regarding damage to underground utilities, including pipelines, to facilitate the analysis of the effectiveness of preventative and protective measures against TPD (CGA, [6]).

The reliability-based pipeline integrity management program with respect to TPD is being increasingly adopted by pipeline operators to deal with uncertainties associated with the occurrence of TPD events [17]. A key task in such a program is to estimate the hit rate due to third-party excavations, which is the product of the rate of excavation activities (typically expressed per year per kilometer of pipeline) and the probability of hit given a third-party activity [4,5,16,22]. The activity rate is estimated from the observed third-party activities that occurred in the vicinity of the pipeline alignment. A fault tree model developed by Chen and Nessim [5] has been widely employed by the pipeline industry to estimate the probability of hit. The fault tree models a pipeline being hit by a third-party excavation as the result of failures of all the preventative and protective measures, such as the third party failing to notify the pipeline operator before the excavation, the excavation going undetected by the ROW patrol, and the excavation depth exceeding the burial depth of the pipeline. Various improvements of the original fault tree developed by Chen and Nessim [5] have been proposed since its development. Chen et al. [4] enhanced the fault tree model by taking into account a broader range of preventative and protective measures typically used in the pipeline industry. Lu and Stephens [22] classified third-party activities into authorized activities (AAs) and unauthorized activities (UAs) based on whether or not the pipeline operator's permission has been obtained prior to the start of the excavation. They then developed a hierarchical fault tree model to evaluate the probability of hit as the weighted sum of the probabilities of hit due to authorized and unauthorized activities.
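The two relationships described in this paragraph can be written compactly as follows. This is only a notational sketch: the symbols \lambda_{\mathrm{hit}}, \lambda_{\mathrm{act}}, w_{AA} and w_{UA} are introduced here for illustration and are not taken from the cited studies, and reading the weights as the fractions of authorized and unauthorized activities (so that they sum to one) is an assumption.

```latex
% Hit rate = activity rate x probability of hit given an activity
\lambda_{\mathrm{hit}} = \lambda_{\mathrm{act}} \, P(H)

% Probability of hit as a weighted sum over authorized (AA) and
% unauthorized (UA) activities; weights assumed to be activity fractions
P(H) = w_{AA} \, P(H \mid AA) + w_{UA} \, P(H \mid UA),
\qquad w_{AA} + w_{UA} = 1
```

For example, with purely hypothetical values \lambda_{\mathrm{act}} = 0.5 activities per km per year and P(H) = 0.01, the hit rate would be 0.005 hits per km per year.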

The failures of individual preventative and protective measures are the basic events of the fault tree models reported in the literature [5, 22]. Chen and Nessim [5] carried out an industry-wide survey to estimate probabilities of basic events, generally as functions of relevant pipeline attributes (e.g. patrol frequency, pipeline burial depth, dig notification response time). In the practice of TPD management over the past few decades, pipeline operators have collected a substantial amount of TPD-related data, such as records of individual third-party activities that include pipeline attributes, prevention measures and consequences of the activities. It is highly desirable to use the collected data to estimate the probabilities of basic events. However, the nature of the fault tree analysis, i.e. top-down deduction, and the fact that the collected TPD data generally contain missing information, i.e. the so-called incomplete data, present significant challenges to the probability updating within the fault tree framework.

Fault tree models can be straightforwardly mapped to corresponding Bayesian networks (BNs) [2, 15], which are well suited for Bayesian inference and probability updating based on observed data. BNs have become increasingly popular in reliability-based pipeline integrity management programs in recent years. Dynamic Bayesian networks (DBNs) and object-oriented Bayesian networks (OOBNs) have been employed to model the structural integrity of offshore and onshore energy pipelines subject to different degradation processes such as corrosion, fatigue and erosion [1,3,26,28,29]. Several BN-based approaches were proposed to carry out the risk analysis of offshore and onshore energy pipelines with respect to TPD threats [7, 10, 11, 16, 20]. Guo et al. [11] developed a BN to characterize the risk of TPD-caused failure for onshore pipelines and adopted the developed BN model to analyze the causes of the TPD incident. Li et al. [20] proposed a dynamic risk analysis model using BNs for subsea pipelines. The fuzzy set theory was employed to convert qualitative expert opinions into probabilities, and the evidence theory was employed to address the conflicts and inconsistency between different expert opinions. The obtained probabilities were then used to develop the conditional probability tables (CPTs) of the BN model for the risk analysis. Cui et al. [7] developed a BN model to quantify the risk of accidental third-party damage and assigned prior probabilities to the BN model in consultation with experts. Guan et al. [10] developed BNs to assess the probability of TPD for onshore pipelines; parameters of the BNs were manually estimated from databases of third-party activities or specified based on the opinions of subject matter experts. Compared with expert opinions, manually estimating model parameters from databases of third-party activities reduced the subjectivity of the estimation; however, its application was limited to simple dependence structures due to its deficiency in handling missing data. Koduru and Lu [16] developed a BN model to evaluate the probability of hit based on the fault tree model reported in Chen et al. [4]. They used the information in the DIRT report to evaluate the probabilities of basic events in the BN model. Since participants of the DIRT program report the TPD data only if third-party incidents are detected, the TPD data in the DIRT report are conditional on the occurrence of pipelines being hit. To estimate the unconditional probability of a basic event, Koduru and Lu [16] manually adjusted its probability iteratively until the probability of the event conditional on a hit equaled the probability estimated from the DIRT report. Such an approach for evaluating the probability of the basic event is highly inefficient. Furthermore, it is very difficult, if possible at all, to estimate the probabilities of multiple basic events simultaneously using this approach.
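The mapping from a fault tree to a BN, and the kind of conditional query that motivates it, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three basic events `A`, `B`, `C`, their probabilities, and the gate logic `hit = (A OR B) AND C` form a toy structure, not the fault tree or BN of the cited studies. Each gate becomes a deterministic CPT, and inference is done by brute-force enumeration.

```python
from itertools import product

# Hypothetical basic-event probabilities; illustrative values, not from the paper.
P_BASIC = {"A": 0.3, "B": 0.6, "C": 0.2}


def hit_cpt(a, b, c):
    """Deterministic CPT for the toy gate logic hit = (A OR B) AND C."""
    return 1.0 if (a or b) and c else 0.0


def joint(states, p=P_BASIC):
    """P(A=a, B=b, C=c, hit=1), assuming independent root nodes."""
    weight = 1.0
    for name, s in zip(p, states):
        weight *= p[name] if s else 1.0 - p[name]
    return weight * hit_cpt(*states)


def prob_hit(p=P_BASIC):
    """Marginal P(hit=1): the same number the fault tree would give."""
    return sum(joint(s, p) for s in product([0, 1], repeat=len(p)))


def prob_basic_given_hit(name, p=P_BASIC):
    """Diagnostic query P(basic event = 1 | hit = 1) by enumeration."""
    idx = list(p).index(name)
    num = sum(
        joint(s, p) for s in product([0, 1], repeat=len(p)) if s[idx] == 1
    )
    return num / prob_hit(p)


if __name__ == "__main__":
    print("P(hit) =", prob_hit())
    print("P(A | hit) =", prob_basic_given_hit("A"))
```

The second function is the direction of reasoning needed when data are conditional on a hit, as with the DIRT reports: the BN returns the conditional probability of a basic event directly, rather than requiring the unconditional probability to be tuned by hand until the conditional value matches.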

The above-described studies have demonstrated the advantage of BNs in characterizing the complex dependence structures of TPD assessment models. However, establishing an approach to elicit parameters of BN models from historical datasets of third-party activities in an objective and efficient manner remains an unsolved issue. To address this problem, the present study proposes to employ the parameter learning technique to learn the parameters of BNs from the historical TPD data. Extensive studies in the area of artificial intelligence have demonstrated that the parameter learning technique associated with BNs provides an automated and objective means to estimate a large number of parameters of BNs from observed data, particularly incomplete data [12,21,23,30]. The TPD-related data (i.e. individual cases of third-party activities) collected by the pipeline industry generally contain incomplete information for estimating the failure probabilities of preventative and protective measures against third-party excavations. The present study considers two typical incomplete datasets that consist of individual third-party activities and proposes to employ the parameter learning technique of BNs to learn the probabilities mentioned above. To this end, a BN model for evaluating the probability of hit given a third-party activity is first developed based on the fault tree model commonly used by the pipeline industry [5, 22], whereby the probabilities to be learned are converted to the parameters of the BN model. The Expectation-Maximization (EM) algorithm in the context of parameter learning is then employed to learn the parameters of the BN from two TPD datasets.
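To make the idea of EM-based parameter learning from incomplete records concrete, the following Python sketch applies it to a deliberately tiny network. It is not the BN or the implementation of the present study: the two binary root nodes `x1` and `x2` (failures of two measures), the known deterministic gate `hit = x1 AND x2`, the record format with `None` marking a missing value, and the example records are all assumptions for illustration. The E-step distributes each record over the completions consistent with what was observed; the M-step re-estimates the root probabilities from the expected counts.

```python
from itertools import product


def hit_given(x1, x2):
    """Known deterministic gate (assumed): a hit needs both measures to fail."""
    return 1 if (x1 and x2) else 0


def bernoulli(p, x):
    """P(X = x) for a binary node with P(X = 1) = p."""
    return p if x else 1.0 - p


def em_learn(records, theta=(0.5, 0.5), n_iter=500, tol=1e-10):
    """Estimate [P(x1=1), P(x2=1)] from records (x1, x2, hit).

    A missing x1 or x2 is given as None; hit is always observed.
    """
    theta = list(theta)
    for _ in range(n_iter):
        expected = [0.0, 0.0]
        for x1, x2, hit in records:
            # E-step: weight each completion consistent with the record.
            weighted = []
            for c1, c2 in product([0, 1], repeat=2):
                consistent = (
                    (x1 is None or c1 == x1)
                    and (x2 is None or c2 == x2)
                    and hit_given(c1, c2) == hit
                )
                if consistent:
                    w = bernoulli(theta[0], c1) * bernoulli(theta[1], c2)
                    weighted.append((c1, c2, w))
            z = sum(w for _, _, w in weighted)
            if z == 0.0:
                raise ValueError("record has zero likelihood under current parameters")
            for c1, c2, w in weighted:
                expected[0] += c1 * w / z
                expected[1] += c2 * w / z
        # M-step: expected counts divided by the number of records.
        new_theta = [e / len(records) for e in expected]
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


# Hypothetical records; the third one has a missing x1.
records = [(1, 1, 1), (0, 0, 0), (None, 0, 0)]
learned = em_learn(records, theta=(0.2, 0.2))
```

In this toy dataset the record with the missing `x1` carries no information about `x1` beyond the current estimate, so the iteration settles where the estimate equals the average over the two complete records. A realistic model would also condition the basic-event probabilities on pipeline attributes, which this sketch omits.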

The remainder of the paper is organized as follows. Section 2 describes a fault tree model for evaluating the probability of hit given a third-party activity and a BN model converted from the fault tree. Section 3 presents the incomplete datasets provided by the pipeline industry and the EM algorithm for parameter learning. Section 4 demonstrates the effectiveness of the parameter learning through a numerical example involving simulated TPD data and an application using a real-world TPD dataset, followed by conclusions in Section 5.

Section snippets

Fault tree for evaluating the probability of hit

A fault tree is a top-down deductive tool to evaluate the probability of failure of a system that is attributed to failures of multiple components of the system [24]. In a fault tree, the system failure is the top event; events that result from occurrences of other events are called intermediate events, and events that cannot be broken down into other events are called basic events. The relationship between higher-level and lower-level events is characterized by Boolean logic, i.e. the “or” and

TPD datasets for parameter learning

The pipeline company that provided the TPD data to the present study owns and operates an extensive network of transmission pipelines in Canada, and has been applying the fault-tree model to manage the TPD threat in the past decade. The company groups its pipeline assets into seven TPD regions based on the geographic location of the pipeline. The pipeline attributes denoted by nodes A1 through A9 in the BN model are the same for all the pipelines within the same TPD region. The company has been

Numerical example and case study

This section first uses a numerical example involving simulated TPD data to demonstrate the effectiveness of the parameter learning for this specific BN model. Then, a case study involving real-world TPD datasets, i.e. Datasets 1 and 2 described in Section 3.1, is presented.

Conclusions

The present study develops a BN model to evaluate the probability of buried oil and gas pipelines being hit by third-party excavation activities based on a fault tree model well recognized in the pipeline industry. The EM algorithm-based parameter learning technique is employed to learn CPTs of the BN model from historical TPD datasets. The effectiveness of the parameter learning is demonstrated by a numerical example involving simulated TPD datasets, where the KL-divergence between the learned

CRediT authorship contribution statement

W. Xiang: Methodology, Software, Investigation, Writing - original draft, Visualization. W. Zhou: Conceptualization, Methodology, Writing - review & editing, Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors gratefully acknowledge the financial support provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), Ontario Trillium Scholarship and Faculty of Engineering at the University of Western Ontario. The constructive comments provided by two anonymous reviewers are appreciated.

References (30)

  • A.R. Masegosa et al. Learning from incomplete data in Bayesian networks with qualitative influences. Int J Approx Reason (2016)

  • Y. Yang et al. Corrosion induced failure analysis of subsea pipelines. Reliab Eng Syst Saf (2017)

  • Y. Zhou et al. An empirical study of Bayesian network parameter learning with monotonic influence constraints. Decis Support Syst (2016)

  • B. Cai et al. Remaining useful life estimation of structure systems under the influence of multiple causes: subsea pipelines as a case study. IEEE Trans Ind Electron (2019)

  • Q. Chen et al. Modeling damage prevention effectiveness based on industry practices and regulatory framework
