Stochastic hybrid automaton model of a multi-state system with aging: Reliability assessment and design consequences

https://doi.org/10.1016/j.ress.2015.12.007

Highlights

  • Analytical model of a multistate system with aging.

  • Limits of analytical techniques for Dynamic Probabilistic Risk Assessment.

  • General definition of Hybrid Basic Events for Dynamic Reliability.

  • Conception of a Stochastic Hybrid Automaton model based on Simulink.

  • Dynamic reliability as a design tool for the dimensioning of industrial systems.

Abstract

Dynamic reliability aims to relax the rigid hypotheses of traditional reliability, making it possible to model multi-state systems and to consider changes in the nominal design conditions of a system. Solving such models is a complex task that cannot be tackled with analytical techniques and must rely on other formalisms based on simulation. One of the most promising simulation approaches is the Stochastic Hybrid Automaton (SHA), which breaks a system down into a physical model and a stochastic model that are coupled through shared variables and synchronising mechanisms.

To foster this research path, an SHA-based simulation model was built for a case study; it made it possible to compute the reliability of a multi-state aging system under dynamic environmental and operational conditions. The same model also helped to understand the system behaviour, which makes it a useful tool for design. Such insights could not be obtained with traditional reliability modelling, as shown by the comparison with a dynamic fault tree.

The SHA model was implemented in the Simulink environment and represents a small step towards the conception and delivery of a user-friendly tool for Dynamic Probabilistic Risk Assessment (DPRA).

Introduction

Traditional techniques of reliability assessment have been developed under hypotheses that simplify many real-life boundary conditions. This is probably the price paid in the early 1930s, when reliability theory started to gain importance and, under the pressure of new technological advances in the military, maritime, Oil & Gas and aircraft industries, grew fast and rigid. Such industrial applications can never stop; they work under well-defined conditions and perform within narrow operating margins for the entire mission time. For these applications, the definition of reliability offered a well-defined theoretical domain, and the mathematics built around it proved elegant and robust [1]. From a practical point of view, the theory was applicable thanks to a famous engineering stratagem: always assume the worst-case scenario.

Nowadays, efforts to improve on this conservative approach have led to a new research field that encompasses many different subjects and goes under the name of performability [2].

Dependability is one of the attributes of performability; it represents an extension of reliability and deals with reliability, availability, safety and related measures of interest for a system [3], [4]. In turn, dependability assessment also considers non-functional aspects related to the functioning of a system, such as reconfiguration, fault tolerance, interferences or dependencies and, more generally, its dynamic evolution. These elements make it possible to overcome the binary view (i.e., failed or working) of the operating state of a component and to relax the hypotheses of traditional reliability theory (i.e., statistical independence), giving room to more insightful analyses that include performance evaluation in degraded conditions and the system evolution under different environmental and operational conditions. In practice, a complex system works under a multitude of different conditions whose sequence and durations can be stochastic or deterministic; therefore, its operating rules and performance can change dramatically.

The need for more realistic reliability assessments emerged within the field of nuclear engineering, with the definition of Dynamic Probabilistic Risk Assessment (DPRA) [5]. This class of problems, also known as Hybrid Stochastic systems [6], is characterized by the coupling of a deterministic physical model (i.e., a first-principles model) with a stochastic one. In this coupling, a stochastic event can trigger a change in the deterministic model and, conversely, a variation of the deterministic model modifies the operational conditions and the probability functions of the stochastic one.
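The two-way coupling described above can be illustrated with a minimal sketch. It is not the model used in the paper: the thermal dynamics, the rate constants and the names `failure_rate` and `simulate_once` are assumptions chosen only for illustration. A cooling unit fails at a rate that grows with room temperature (the physical model affects the stochastic one), and its failure switches the room to an uncooled regime (the stochastic event changes the physical model).

```python
import math
import random


def failure_rate(temp_c, base_rate):
    """Hypothetical failure rate (per hour) that grows with room temperature."""
    return base_rate * math.exp(0.05 * (temp_c - 20.0))


def simulate_once(rng, base_rate=1e-3, horizon_h=1000.0, dt=0.1, tau_h=2.0):
    """Simulate one history with fixed time steps.

    Returns (failure_time, final_temperature); failure_time is None if the
    cooling unit survives the whole horizon.
    """
    temp = 20.0        # continuous variable of the physical model (deg C)
    working = True     # discrete state of the stochastic model
    failure_time = None
    steps = int(round(horizon_h / dt))
    for k in range(steps):
        t = k * dt
        # Physical model: the room relaxes towards a target that depends on
        # the discrete state (cooling on or off) and on a daily ambient cycle.
        ambient = 30.0 + 5.0 * math.sin(2.0 * math.pi * t / 24.0)
        target = 20.0 + 0.3 * (ambient - 30.0) if working else ambient + 15.0
        temp += dt * (target - temp) / tau_h
        # Stochastic model: the failure probability over the step depends on
        # the current temperature, a variable shared with the physical model.
        if working:
            p_fail = 1.0 - math.exp(-failure_rate(temp, base_rate) * dt)
            if rng.random() < p_fail:
                working = False
                failure_time = t
    return failure_time, temp


failure_time, final_temp = simulate_once(random.Random(42))
print(failure_time, round(final_temp, 1))
```

The fixed time step keeps the sketch short; an event-driven scheme would be a reasonable alternative when failures are rare compared with the step size.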

In the literature, DPRA is also referred to as Probabilistic Dynamics or Dynamic Reliability [7], [8], [9], [10]; the term dynamic refers to changes in the environmental and operational conditions. Another main difference from traditional quantitative techniques [2] like Static Fault Trees (SFT) and Reliability Block Diagrams (RBD) is the possibility of modelling the aging effects in a system more accurately. As a matter of fact, in a DPRA a component degrades only during the intervals of time in which it is operating, as opposed to conventional analyses where aging factors are usually given as inputs to the analysis.
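The idea that a component ages only while it operates can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the Weibull law, its parameters `eta` and `beta`, the duty cycle and the helper names `operating_age` and `weibull_reliability` are all hypothetical, and the component is assumed not to degrade at all while switched off.

```python
import math


def operating_age(on_intervals, t):
    """Cumulative operating time up to calendar time t.

    on_intervals is a list of (start, stop) pairs during which the component runs.
    """
    return sum(max(0.0, min(stop, t) - start) for start, stop in on_intervals)


def weibull_reliability(age, eta, beta):
    """Weibull survival probability evaluated at a given age."""
    return math.exp(-((age / eta) ** beta))


# Hypothetical duty cycle: 8 hours on, 16 hours off, for three days.
on_intervals = [(0.0, 8.0), (24.0, 32.0), (48.0, 56.0)]
t = 60.0
age = operating_age(on_intervals, t)

r_operating = weibull_reliability(age, eta=100.0, beta=2.0)  # ages only when on
r_calendar = weibull_reliability(t, eta=100.0, beta=2.0)     # ages all the time
print(age, r_operating, r_calendar)
```

Evaluating the same Weibull law at the operating age instead of the calendar time yields a higher survival probability whenever the component spends part of the mission switched off.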

At the state of the art, both analytical and simulation techniques can be used to handle a DPRA problem. Among the former, Piecewise Deterministic Markov Process (PDMP) [10], [11], [12] and Regime Switching Modelling (RSM) [13], [14] are solid mathematical frameworks able to model both aging effects and system evolution. PDMP models the aging evolution of a system with a set of differential equations, while RSM models the dynamic change of the system with a sequence of Continuous-Time Markov Chains (CTMC), each one describing a particular type of environmental/operational condition. An alternating renewal process governs the regime switching from one CTMC to another. State space modelling has also recently been applied to dynamic reliability with aging; [9] offers an elegant review of the most suitable analytical methods, including Generalised Stochastic Markov Processes (GSMP), able to address several dynamic reliability behaviours such as fault coverage and load sharing. Moreover, it provides a useful guideline flowchart that shows which modelling approach is best suited to the problem at hand.
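The regime-switching idea can be made concrete with a small simulation sketch. It is a simplification under assumptions that do not come from the cited works: only two regimes, exponentially distributed sojourn times, a non-repairable component with one failure rate per regime, and hypothetical names (`time_to_failure`, `reliability`) and numerical values.

```python
import math
import random


def time_to_failure(rng, fail_rate, mean_sojourn, horizon):
    """One history of a component under two alternating regimes (0, 1, 0, ...).

    Returns the failure time, or None if the component survives the horizon.
    Resampling the failure time at each regime change is valid only because
    exponential times are memoryless.
    """
    t, regime = 0.0, 0
    while t < horizon:
        sojourn = rng.expovariate(1.0 / mean_sojourn[regime])
        rate = fail_rate[regime]
        ttf = rng.expovariate(rate) if rate > 0.0 else math.inf
        if ttf < sojourn:
            failure = t + ttf
            return failure if failure < horizon else None
        t += sojourn
        regime = 1 - regime
    return None


def reliability(fail_rate, mean_sojourn, horizon, n=20000, seed=0):
    """Monte Carlo estimate of the probability of surviving up to horizon."""
    rng = random.Random(seed)
    survived = sum(
        time_to_failure(rng, fail_rate, mean_sojourn, horizon) is None
        for _ in range(n)
    )
    return survived / n


# Hypothetical rates: mild regime (index 0) and harsh regime (index 1).
print(reliability(fail_rate=(0.001, 0.02), mean_sojourn=(16.0, 8.0), horizon=100.0))
```

When both regimes share the same failure rate, the estimate should approach the plain exponential reliability, which gives a simple sanity check for the sketch.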

With the increase in computing power, simulation can be a valid alternative and may in fact be the most suitable approach for complex DPRA models. In particular, simulation allows the analysis of systems with a non-Markovian structure, as shown in [15] for an application in the Oil & Gas sector. Simulation can be implemented with many different tools, from a spreadsheet [16] to well-known high-level tools like Simulink [17], and can benefit from several speed-up algorithms [18], [19]. Other research contributions show how hybrid stochastic models like Fluid Stochastic Petri Nets (FSPN) [20] and Stochastic Activity Networks (SAN) [21], [22] can be used to implement a continuous process with stochastic features and simulate dynamic reliability problems. In fact, besides the elements of traditional Petri Nets, hybrid stochastic models provide additional objects that allow the characterisation of a continuous/discrete marking and of time-dependent activities. Such models are then solved using a discrete event simulation engine. Although these modelling formalisms are by now well established in industry, it must be pointed out that the flat representation of a complex Petri Net, made up of places and activities, can become large and difficult to interpret, even more so when describing a continuous process typical of a mechanical or physical system. Hierarchy is a feature that has often been used to alleviate this issue; for instance, SHARPE [23] can combine SFT, GSPN and Markov Chains, RAATSS [24] supports DFT and ATS, while MÖBIUS [25], [26] offers high-level constructs (JOIN and REP) to build composed hierarchical SAN models from simpler atomic models, which can be developed independently, replicated and joined. In particular, the REP construct replicates an atomic model, while the JOIN construct combines two or more atomic models on the basis of a set of shared variables.

These hybrid formalisms are very powerful and general but do not offer any high-level construct for modelling systems characterized by physical and mechanical interactions. In these cases, the main drawback is the effort required to maintain and hand over such models. For these reasons, the authors recognise the importance of further investigating promising hybrid modelling approaches like the Stochastic Hybrid Automaton (SHA) for the resolution of DPRA problems, such as the one shown in [6], [27], [28], [29], and the use of other tools better suited to describing dynamic systems. Among the several attributes of dependability, this work deals with reliability assessment from a dynamic reliability point of view.

Therefore, starting from the definition of system reliability, the DPRA modelling is gradually introduced with the inclusion of the aging effects and of the dynamic changes of the working/operative conditions of a system.

This first section clarifies why analytical techniques, like PDMP, GSMP or RSM, fail to solve such models. Afterwards, a non-trivial reliability assessment case study possessing all the characteristics of a DPRA problem is presented to better show the dynamic multi-state nature of a DPRA model.

The attempt to use a RAMS modelling technique like DFT, BDMP or DBRD to solve the case study confirms the limits of this class of reliability techniques and highlights the need for a more powerful class of basic events (defined here as Hybrid Basic Events, HBE) that better suits the modelling characteristics of a DPRA problem.

In this paper, the solution of a DPRA problem is tackled with a Monte Carlo simulation applied to an architecture of concurrent models, based on the separation of concerns [30], [31] and SHA modelling [6], [27], [28], [29]. A recent paper [48] introduced a formal SHA model (called SHyFTA) that extends the MatCarloRE tool for dynamic reliability with DFT, but it did not consider the dimensioning activity, which cannot be neglected when performing a dynamic reliability assessment. In fact, as shown in this paper, the reliability of a system is strongly affected by the working and operational conditions in which it operates, and plant dimensioning (i.e., the choice of the correct system tuning) can extend the system life by reducing aging and wear-out.
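As a rough illustration of how a Monte Carlo campaign turns simulated histories into a reliability figure, the sketch below computes the fraction of histories still working at a given time together with its binomial standard error. The function name `reliability_estimate` and the sample data are hypothetical and do not come from the paper.

```python
import math


def reliability_estimate(failure_times, t):
    """Empirical reliability at time t and its binomial standard error.

    failure_times holds one entry per simulated history: the failure time,
    or None if the system never failed within the simulated mission.
    """
    n = len(failure_times)
    survivors = sum(1 for ft in failure_times if ft is None or ft > t)
    r_hat = survivors / n
    std_err = math.sqrt(r_hat * (1.0 - r_hat) / n)
    return r_hat, std_err


# Hypothetical outcomes of four simulated histories (hours).
histories = [None, 120.0, 40.0, None]
r_hat, std_err = reliability_estimate(histories, t=100.0)
print(r_hat, std_err)
```

The standard error shrinks with the square root of the number of histories, which is why real campaigns typically use far more than the four histories shown here.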

The proposed architecture offers several benefits. First, the modelling effort is reduced because the original DPRA problem can be broken down into two different simulation models, the physical and the stochastic, that are individually simpler to implement. Moreover, this architecture makes it easy to describe the multi-state nature of a system in terms of mechanical performance, degradation, variation of independent physical variables (like temperatures, pressures and aging) and change of failure and stochastic characteristics, and it is able to capture any cumulative damage behaviour.

Thus, the main contributions of this paper can be summarised as follows:

  1. It offers a review of dynamic reliability, the limits of analytical techniques and the opportunity to adopt a simulation-based Stochastic Hybrid Automaton model;

  2. It introduces the hybrid basic event [48] as a general concept for dynamic reliability modelling, able to capture the multi-state nature of a component under dynamic working and environmental conditions;

  3. For the case study, it presents the implementation of a Stochastic Hybrid Automaton that can be used as a sizing tool for finding the optimal trade-off between system reliability and performance. Moreover, it can be adopted as a reference model, alternative to the SHA solutions discussed in [6], [27], [28], [29], [48].

The remainder of this paper is organised as follows: Section 2 introduces dynamic reliability. Section 3 presents the case study of a Data Cluster system, showing its characteristics in terms of DPRA. Section 4 presents the SHA simulation model of the case study, while Section 5 discusses the results of the reliability assessment with respect to several sizing configurations of the system under evaluation. Section 6 contains a discussion summarizing the advantages and drawbacks of the SHA-HPM, including the Simulink implementation offered. Finally, Section 7 provides conclusions and outlines directions for future research.

Section snippets

Literature review on dynamic probabilistic risk assessment

DPRA aims to relax the rigid hypotheses of traditional RAMS techniques, focusing on systems that operate in variable and dynamic conditions.

It can consider numerous characteristics of complex systems, such as environmental dependencies, interactions between continuous process variables and system components, and stochastic and deterministic behaviours evolving in time. As a matter of fact, a component does not always operate around the nominal design operating conditions, resulting in

Case study: dynamic reliability assessment of a Data Cluster

Fig. 2 shows the layout of a Data Cluster installation. This system includes a service facility that maintains the conditions required for the correct functioning of the Data Cluster. The service facility (e.g., the air conditioning system) consists of an Internal Unit (IU), the Air Treatment Unit (ATU), and an External Unit (EU). The ATU evaporates the coolant, cooling the internal environment, while the EU compresses and condenses the coolant

Implementation of the SHA-HPM simulation

As discussed in the previous section, the DPRA has to be solved via simulation, using a concurrent simulation approach. It was implemented with Simulink, a block diagram environment of the MATLAB suite [42]. Simulink was chosen because it can be used effectively to simulate complex concurrent models and solve dynamic systems. Complex logic can be implemented using Boolean logic blocks, Switch blocks, Memory blocks and Assertion blocks. The latter, in particular,

Simulation campaign

At first, the Data Cluster system was studied with the reliability models of Fig. 4. These representations are equivalent and become valid as a result of a dimensioning process aimed at preventing overheating in the technical room where the Data Cluster is placed. Specifically, the air conditioning system is a component that engineers decide to install to improve the reliability of the Data Cluster. In fact, without the air conditioning system the

Discussion

This section contains a brief discussion about the SHA-HPM methodology and the Simulink implementation shown in this paper, providing information about modelling efforts, computational aspects and related benefits and drawbacks.

The first aspect to highlight is that the modelling effort of a SHA-HPM is lower than that of an analytical model. This benefit is linked to the simulation nature of the SHA-HPM, which places no limitations on the number of components, inter-dependencies, working behaviours and

Conclusions

Traditional RAMS techniques cannot be used to analyse systems featuring aging and dynamic changes of boundary conditions. Dynamic reliability arises from the awareness that the performance of a system is tightly interconnected with its failure behaviour and, consequently, a holistic design of a plant solution cannot disregard this combination of behaviours.

In this paper, a Stochastic Hybrid Automaton model has been created to assess the dynamic reliability of a multi-state system with aging,

References (48)

  • B. Kaiser et al.

    State/event fault trees—a safety analysis model for software-controlled systems

    Reliab Eng Syst Saf

    (2007)
  • D. Wang et al.

    Performability analysis of clustered systems with rejuvenation under varying workload

    Perform Eval

    (2007)
  • F. Chiacchio et al.

    A Weibull-based compositional approach for hierarchical dynamic fault trees

    Reliab Eng Syst Saf

    (2013)
  • G. Merle et al.

    Algebraic determination of the structure function of Dynamic Fault Trees

    Reliab Eng Syst Saf

    (2011)
  • F. Chiacchio et al.

    SHyFTA, a Stochastic Hybrid Fault Tree Automaton for the modelling and simulation of dynamic reliability problems

    Exp Syst Appl

    (2016)
  • M. Rausand et al.

    Life data analysis

    System reliability theory: models, statistical methods, and applications

    (1994)
  • K.B. Misra

    Handbook of performability engineering

    (2008)
  • K. Goševa-Popstojanova et al.

    Stochastic modeling formalisms for dependability, performance and performability

  • A. Avižienis et al.

    Basic concepts and taxonomy of dependable and secure computing

    IEEE Trans Dependable Secur Comput

    (2004)
  • Aubry J.F., Brînzei N. Stochastic hybrid automaton, In: Systems Dependability Assessment 2015 John Wiley & Sons, Inc....
  • H. Zhang et al.

    Piecewise deterministic Markov processes and dynamic reliability

    Proc Inst Mech Eng

    (2008)
  • Bouissou M and Jankovic M. Critical comparison of two user friendly tools to study Piecewise Deterministic Markov...
  • Bouissou M, Churabova I and Chraibi H. Critical comparison of two user friendly tools to study Piecewise Deterministic...
  • A.G. Hawkes et al.

    Modeling the evolution of system reliability performance under alternative environments

    IIE Trans

    (2011)