An automated method for estimating reliability of grid systems using Bayesian networks
Introduction
Grid computing has become relevant due to its applications in large-scale resource sharing, wide-area information transfer, and multi-institutional collaboration. In general, services in grid computing request a set of resources available in a grid to complete certain tasks. Many experts believe that grid technologies will offer a chance to extend the benefits of the Internet [1]. However, it is difficult to analyze grid reliability because grids are highly heterogeneous and distributed. Because grid systems involve cross-organizational sharing, they support existing distributed computing technologies. As an example, enterprise-level distributed computing systems can use grid technologies to achieve resource sharing across their different institutions. Although several development tools and techniques for grid systems have been studied, estimating grid reliability is not straightforward due to the size and complexity of the grid [2]. Therefore, new analytical methods are needed to evaluate grid reliability.
Over the past several years, research and development efforts have focused on the challenges that arise when large grid organizations [1], [2], [3], [4] are built. Estimating grid system reliability is a more recent topic, and only a few studies in the literature address it [5], [6], [7], [8]. In these studies, the grid system reliability is estimated by focusing on the reliabilities of the services provided in the grid system. For this purpose, the grid system components that are involved in a grid service are classified into spanning trees, and each tree is studied separately. However, these studies mainly focus on understanding grid system structures rather than estimating the actual system reliability. Thus, to simplify the analysis, they make certain assumptions about component failure rates, for example that the rates follow a given probability distribution [7].
For reliability estimation, Bayesian networks (BN) have been proposed as an efficient method [9], [10], [11], [12]. BN provide significant advantages over traditional frameworks for systems engineers, mainly because they are easy to interpret and can be used in interaction with domain experts in the reliability field [13]. Using the BN structure and the associated probability values, the system reliability can be estimated with the help of Bayes' rule [12]. There are several recent studies on reliability estimation using BN [9], [11], [14], [15], [16], all of which require specialized networks designed for a specific system. That is, the BN to be used for analyzing system reliability must be known beforehand (i.e., the BN can be built by an expert who has “adequate” knowledge about the system under consideration). However, human intervention is always open to unintentional mistakes that could cause discrepancies in the results [17].
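To illustrate how reliability follows from a BN structure and its probability values, the following minimal sketch evaluates a three-node network by enumeration. The network (one link and one node feeding one service), all probability values, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import product

# Hypothetical priors: probability that each component is working.
P_WORKS = {"link": 0.95, "node": 0.90}

# Hypothetical conditional probability table:
# P(service works | link state, node state).
P_SERVICE = {
    (True, True): 0.99,
    (True, False): 0.10,
    (False, True): 0.05,
    (False, False): 0.00,
}


def joint(link, node, service):
    """Joint probability of one full assignment under the BN factorization."""
    p = P_WORKS["link"] if link else 1 - P_WORKS["link"]
    p *= P_WORKS["node"] if node else 1 - P_WORKS["node"]
    p_service = P_SERVICE[(link, node)]
    return p * (p_service if service else 1 - p_service)


def service_reliability():
    """P(service works), obtained by summing out the component states."""
    states = product([True, False], repeat=2)
    return sum(joint(link, node, True) for link, node in states)


def posterior_link_failed_given_service_failed():
    """Bayes' rule: P(link failed | service failed)."""
    numerator = sum(joint(False, node, False) for node in [True, False])
    states = product([True, False], repeat=2)
    evidence = sum(joint(link, node, False) for link, node in states)
    return numerator / evidence
```

Enumeration is only practical for very small networks; it is used here because it makes the factorization and the application of Bayes' rule explicit.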
To address these issues, this paper introduces a methodology for estimating grid system reliability that combines BN construction from raw component and system data, association rule mining, and evaluation of conditional probabilities. Based on an extensive literature review, this is the first study that incorporates these methods for estimating grid system reliability. With the increasing popularity of computer environments in systems engineering, grid systems have been widely used in various system-related applications. Understanding the grid system structure and the component relationships is essential for systems engineers who need to allocate resources optimally and improve system reliability. This study provides a methodology for automated discovery of component relationships and estimation of the reliability of grid services to help systems engineers.
The methodology suggested in this paper automates spanning tree discovery and BN construction by using the K2 algorithm (a commonly used association rule mining algorithm), which identifies the associations among the grid system components by using a predefined scoring function and a heuristic. Once the BN is constructed, the reliabilities of grid services are estimated with the help of Bayes' rule. Unlike previous studies, the methodology proposed in this paper does not rely on any assumptions about the component failure rates in grid systems. Moreover, the proposed method does not require prior knowledge about the grid system structure.
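The scoring function and heuristic used in the paper are not specified in this excerpt. As a rough sketch of how K2-style structure discovery works in general, the code below assumes the Cooper-Herskovits score and the usual greedy parent search over a fixed node ordering. The function names, the `max_parents` limit, and the data layout (a list of dicts with integer states) are assumptions made for this example.

```python
from math import lgamma


def log_ch_score(data, child, parents, arity=2):
    """Log Cooper-Herskovits score of `child` given a candidate parent set.

    `data` is a list of dicts mapping variable names to integer states
    in range(arity). Unobserved parent configurations contribute zero.
    """
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0] * arity)[row[child]] += 1
    score = 0.0
    for state_counts in counts.values():
        total = sum(state_counts)
        # log[(r - 1)! / (N_ij + r - 1)!] for this parent configuration
        score += lgamma(arity) - lgamma(total + arity)
        # log of the product of N_ijk! over the child states
        score += sum(lgamma(n + 1) for n in state_counts)
    return score


def k2(data, order, max_parents=2):
    """Greedy K2-style search: each node may only take parents that
    precede it in `order`, added one at a time while the score improves."""
    parents = {v: [] for v in order}
    for i, child in enumerate(order):
        best = log_ch_score(data, child, parents[child])
        candidates = list(order[:i])
        while candidates and len(parents[child]) < max_parents:
            scored = [
                (log_ch_score(data, child, parents[child] + [c]), c)
                for c in candidates
            ]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score; stop adding parents
            best = top_score
            parents[child].append(top)
            candidates.remove(top)
    return parents
```

Given observations of component states and a service state, `k2(data, order)` returns a parent list for each variable, from which the conditional probability tables can then be estimated by counting. The result depends on the node ordering supplied by the caller, which is a known property of K2-style search.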
Background information
This section provides background information about grid systems, BN, and the K2 algorithm. Earlier studies on estimating grid system reliability are also discussed.
Estimating grid service reliability using BN
As discussed in Section 2, several studies in the literature [9], [10], [11], [13], [14], [29], [30], [31] define reliability estimation methods for traditional small-scale systems. However, these studies mostly rely on certain assumptions about the system topology and the operational probabilities of the components (links and nodes). In the case of dynamic grid systems, these assumptions may be invalid, since links can be destroyed or established on the fly. Moreover, due to dynamic
Experimental analysis
This section provides an experimental analysis of the proposed method for grid service reliability estimation. For the experiments, the proposed method is implemented in Matlab 8 on a computer equipped with an Intel Core 2 Duo 2.1 GHz CPU and 2 GB of RAM, running the 32-bit Windows Vista Business operating system.
First, the experiments on the performance of the proposed method are provided. For this purpose, several grid services are created in the example grid system shown in Fig. 9.
Conclusions
Grid systems are a newly developed concept for large-scale distributed systems. In a grid system, there can be various nodes that are logically and physically distributed, and large-scale sharing of resources between these nodes is essential. There are mainly two types of nodes in a grid system: RM share resources and RN request service from them. Identification of the links and nodes between RN and RM is essential for estimating the reliability of the requested service. Due to their special and
References (30)
- et al. Optimal resource allocation on grid systems for maximizing service reliability using a genetic algorithm. Reliability Engineering & System Safety (2006)
- et al. Reliability of grid service systems. Computers and Industrial Engineering (2006)
- et al. A generic method for estimating system reliability using Bayesian networks. Reliability Engineering and System Safety (2009)
- et al. Improving the analysis of dependable systems by mapping fault trees into Bayesian networks. Reliability Engineering and System Safety (2001)
- et al. A heuristic approach to generating file spanning trees for reliability analysis of distributed computing systems. Computers and Mathematics with Application (1997)
- et al. A study of service reliability and availability for distributed systems. Reliability Engineering and System Safety (2003)
- et al. The anatomy of the grid: enabling scalable virtual organizations. International Journal of Supercomputer Applications (2001)
- et al. A computation management agent for multi-institutional grids. Cluster Computing (2002)
- Foster I, Kesselman C. In computational grids. VECPAR, 1998; Morgan Kaufmann, 1998. p....
- et al. Economic and on demand brain activity analysis on global grids. Computing Research Repository (2003)
- Optimal resource allocation for maximizing performance and reliability in tree-structured grid services. IEEE Transactions on Reliability
- A hierarchical modeling and analysis for grid service reliability. IEEE Transactions on Computers
- A continuous-time Bayesian network reliability modeling, and analysis framework. IEEE Transactions on Reliability
Cited by (28)
Design for dependability — State of the art and trends
2024, Journal of Systems and Software
Probabilistic probe selection algorithm for fault diagnosis in communication networks
2021, Computer Networks
Evolution of Safety and Security Risk Assessment methodologies towards the use of Bayesian Networks in Process Industries
2021, Process Safety and Environmental Protection
Development of a hybrid method to assess grid-related LOOP scenarios for an NPP
2021, Reliability Engineering and System Safety
Citation Excerpt: To relieve the conservatism, the risk-significant scenarios are ranked benefiting from the probabilistic approaches [6]. Continuous improvements have been pursued in the development of methods and tools for reliability assessment of electrical transmission grid [2,3,8,9,12–16,18,20,27,29,31,33,34,44]. Volkanovski developed a fault-tree based method that concerns the connectivity of energy delivery paths from the generators to specified load point [42].
Applications of Bayesian networks and Petri nets in safety, reliability, and risk assessments: A review
2019, Safety Science
Citation Excerpt: In this approach, associations between system components were identified using a data mining algorithm named K2. The same authors have performed similar research in Doguc and Ramirez-Marquez (2012) for reliability analysis of grid systems. Jiang et al. (2013) proposed a novel probabilistic model, called the hybrid relation model (HRM) for reliability evaluation of programmable logic controller (PLC) systems.
A review of the development of Smart Grid technologies
2016, Renewable and Sustainable Energy Reviews
Citation Excerpt: As grids continue to grow in size and complexity, it becomes more difficult to analyze grid reliability but new analytical methods from research efforts have continued to build a stronger reliability foundation for modern networks. A data mining algorithm to discover grid system structure from raw historical system data can estimate grid service reliability by using Bayesian networks [29]. Remote monitoring of hybrid generation and automatic Smart Grid management for instable distribution main contribute to efficiency [30].