An automated method for estimating reliability of grid systems using Bayesian networks

https://doi.org/10.1016/j.ress.2012.03.016

Abstract

Grid computing has become relevant due to its applications to large-scale resource sharing, wide-area information transfer, and multi-institutional collaboration. In general, a grid service requests the use of a set of resources available in the grid to complete certain tasks. Although analysis tools and techniques for these types of systems have been studied, grid reliability analysis is generally computation-intensive due to the complexity of the system. Moreover, conventional reliability models share some common assumptions that cannot be applied to grid systems. Therefore, new analytical methods are needed for effective and accurate assessment of grid reliability. This study presents a new method for estimating grid service reliability which, unlike previous studies, does not require prior knowledge about the grid system structure. Moreover, the proposed method does not rely on any assumptions about the link and node failure rates. The approach uses a data-mining algorithm, K2, to discover the grid system structure from raw historical system data, which makes it possible to find minimum resource spanning trees (MRST) within the grid; it then uses Bayesian networks (BN) to model the MRST and estimate grid service reliability.

Introduction

Grid computing has become relevant due to its applications to large-scale resource sharing, wide-area information transfer, and multi-institutional collaboration. In general, in grid computing, services request a set of resources available in a grid to complete certain tasks. Many experts believe that grid technologies will offer a chance to extend the benefits of the Internet [1]. However, grid reliability is difficult to analyze because grids are highly heterogeneous and distributed. Because grid systems involve cross-organizational sharing, they support existing distributed computing technologies. For example, enterprise-level distributed computing systems can use grid technologies to share resources across their different institutions. Although several development tools and techniques for grid systems have been studied, estimating grid reliability is not straightforward due to the size and complexity of the grid [2]. Therefore, new analytical methods are needed to evaluate grid reliability.

Over the past several years, research and development efforts have focused on the challenges that arise when large grid organizations are built [1], [2], [3], [4]. Because the topic is recent, only a few studies in the literature address the estimation of grid system reliability [5], [6], [7], [8]. In these studies, the grid system reliability is estimated by focusing on the reliabilities of the services provided in the grid system. For this purpose, the grid system components that are involved in a grid service are classified into spanning trees, and each tree is studied separately. However, these studies mainly focus on understanding grid system structures rather than estimating the actual system reliability. Thus, to simplify the analysis, they make certain assumptions about component failure rates, such as that the rates follow a given probability distribution [7].

For reliability estimation, Bayesian networks (BN) have been proposed as an efficient method [9], [10], [11], [12]. BN provide significant advantages over traditional frameworks for systems engineers, mainly because they are easy to interpret and can be used in interaction with domain experts in the reliability field [13]. Using the BN structure and the probabilistic values, the system reliability can be estimated with the help of Bayes' rule [12]. There are several recent studies on reliability estimation using BN [9], [11], [14], [15], [16], which require specialized networks that are designed for a specific system. That is, the BN to be used for analyzing system reliability must be known beforehand (i.e. the BN is built by an expert who has “adequate” knowledge about the system under consideration). However, human intervention is always open to unintentional mistakes that could cause discrepancies in the results [17].
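As an illustration of how reliability can be read off a BN, the following minimal Python sketch enumerates a two-component network by hand. The component names, the prior reliabilities and the conditional probability table are hypothetical values chosen for the example and are not taken from the paper; a real grid service would involve many more nodes and a learned structure.

```python
from itertools import product

# Hypothetical prior probabilities that each component is working.
P_WORK = {"c1": 0.9, "c2": 0.8}

# Hypothetical conditional probability table:
# P(service works | state of c1, state of c2), with 1 = working, 0 = failed.
CPT_SERVICE = {(1, 1): 0.99, (1, 0): 0.60, (0, 1): 0.50, (0, 0): 0.00}


def joint(c1, c2, s):
    """Joint probability P(c1, c2, s) under the BN factorisation."""
    p = P_WORK["c1"] if c1 else 1 - P_WORK["c1"]
    p *= P_WORK["c2"] if c2 else 1 - P_WORK["c2"]
    p_s = CPT_SERVICE[(c1, c2)]
    return p * (p_s if s else 1 - p_s)


def service_reliability():
    """P(service works), obtained by summing out the component states."""
    return sum(joint(c1, c2, 1) for c1, c2 in product((0, 1), repeat=2))


def posterior_c1_failed_given_service_failed():
    """P(c1 failed | service failed), obtained with Bayes' rule."""
    numerator = sum(joint(0, c2, 0) for c2 in (0, 1))
    evidence = sum(joint(c1, c2, 0) for c1, c2 in product((0, 1), repeat=2))
    return numerator / evidence


print(round(service_reliability(), 4))
print(round(posterior_c1_failed_given_service_failed(), 4))
```

The first function gives the marginal service reliability, while the second shows the diagnostic direction: given an observed service failure, Bayes' rule ranks which component is the more likely cause.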

To address these issues, this paper introduces a methodology for estimating grid system reliability that combines BN construction from raw component and system data, association rule mining, and evaluation of conditional probabilities. Based on an extensive literature review, this is the first study that incorporates these methods for estimating grid system reliability. With the increasing popularity of computer environments in systems engineering, grid systems have been widely used in various system-related applications. Understanding the grid system structure and the component relationships is essential for systems engineers who need to allocate resources optimally and improve system reliability. This study provides a methodology for automated discovery of component relationships and estimation of the reliability of grid services to help systems engineers.

The methodology suggested in this paper automates the process of spanning tree discovery and BN construction by using the K2 algorithm (a commonly used association rule mining algorithm), which identifies the associations among the grid system components by using a predefined scoring function and a heuristic. In the proposed method, once the BN is efficiently and accurately constructed, the reliabilities of grid services are estimated with the help of Bayes' rule. Unlike previous studies, the methodology proposed in this paper does not rely on any assumptions about the component failure rates in grid systems. Moreover, the proposed method does not require prior knowledge about the grid system structure.
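To make the idea of a scoring function combined with a greedy heuristic concrete, the sketch below implements a small K2-style structure search in Python. It assumes the standard Cooper-Herskovits scoring metric with a uniform prior, binary variables, and a node ordering supplied by the user; this excerpt does not state which scoring function, ordering or parent limit the authors use, so these choices and all names (`log_k2_score`, `k2`, `max_parents`) are illustrative assumptions rather than the paper's implementation.

```python
from math import lgamma


def log_k2_score(data, child, parents, arity=2):
    """Log Cooper-Herskovits score of `child` given a list of `parents`.

    `data` is a list of dicts mapping variable names to states 0..arity-1.
    """
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0] * arity)[row[child]] += 1
    score = 0.0
    for n_ijk in counts.values():
        n_ij = sum(n_ijk)
        # log[(r - 1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
        score += lgamma(arity) - lgamma(n_ij + arity)
        score += sum(lgamma(n + 1) for n in n_ijk)
    return score


def k2(data, order, max_parents=2, arity=2):
    """Greedy K2 search: each node may only take parents that precede it."""
    parents = {v: [] for v in order}
    for i, child in enumerate(order):
        best = log_k2_score(data, child, parents[child], arity)
        candidates = list(order[:i])
        while candidates and len(parents[child]) < max_parents:
            scored = [
                (log_k2_score(data, child, parents[child] + [c], arity), c)
                for c in candidates
            ]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score; stop adding parents
            best = top_score
            parents[child].append(top)
            candidates.remove(top)
    return parents


# Hypothetical history: the service "s" works only when both components work.
history = [
    {"c1": a, "c2": b, "s": a & b}
    for a in (0, 1) for b in (0, 1) for _ in range(10)
]
structure = k2(history, order=["c1", "c2", "s"])
print(structure)
```

On this toy history the search links the service node to both components and leaves the two components unconnected, which is the kind of component-to-service association the method relies on. The result depends on the chosen ordering, which is a known property of K2.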

Background information

This section provides background information about the grid systems, BN and the K2 algorithm. Earlier studies on estimating grid system reliability are also discussed in this section.

Estimating grid service reliability using BN

As discussed in Section 2, there are several studies [9], [10], [11], [13], [14], [29], [30], [31] in the literature that define reliability estimation methods for traditional small-scale systems. However, these studies mostly rely on certain assumptions about the system topology and the operational probabilities of the components (links and nodes). In the case of dynamic grid systems, these assumptions may be invalid since links can be destroyed or established on the fly. Moreover, due to dynamic

Experimental analysis

This section provides an experimental analysis of the proposed method for grid service reliability estimation. For this analysis, the proposed method is implemented in MATLAB 8 on a computer equipped with an Intel Core 2 Duo 2.1 GHz CPU and 2 GB of RAM, running the 32-bit Windows Vista Business operating system.

First, the experiments on the performance of the proposed method are provided. For this purpose, several grid services are created in the example grid system shown in Fig. 9.

Conclusions

Grid systems are a recently developed concept for large-scale distributed systems. In a grid system, there can be various nodes that are logically and physically distributed, and large-scale sharing of resources between these nodes is essential. There are mainly two types of nodes in a grid system: RM share resources and RN request service from them. Identification of the links and nodes between RN and RM is essential for estimating the reliability of the requested service. Due to their special and

References (30)

  • Y.S. Dai et al. Optimal resource allocation for maximizing performance and reliability in tree-structured grid services. IEEE Transactions on Reliability (2007)
  • Y.S. Dai et al. A hierarchical modeling and analysis for grid service reliability. IEEE Transactions on Computers (2007)
  • Amasaki S, Takagi Y, Mizuno O, Kikuno T. A Bayesian belief network for assessing the likelihood of fault content....
  • H. Boudali et al. A continuous-time Bayesian network reliability modeling, and analysis framework. IEEE Transactions on Reliability (2006)
  • Gran BA, Helminen A. A Bayesian belief network for reliability assessment. Safecomp 2001 2187, 2001, p....