Establishment of the optimal time interval between periodic inspections for redundant systems

https://doi.org/10.1016/j.ress.2014.06.021

Abstract

For redundant systems with periodic inspections, establishing the optimal time interval between inspections that maximizes availability and minimizes costs is a challenging issue. This paper develops a model to analyze the reliability and determine the optimal interval between inspections of redundant systems subjected to periodic inspections. It uses discrete time Markov Chains to define the transition probabilities between the states of the system and the costs associated with each state. To optimize the time between inspections, the total cost per cycle is minimized using the Markov Chain properties followed by a numerical search technique. Four models of systems are analyzed, and numerical examples for systems comprised of two and three components are presented: Model I – Active redundancy without component repair; Model II – Active redundancy with component repair; Model III – Standby redundancy without component repair; and Model IV – Standby redundancy with component repair. The main advantage of the model used in this paper is the inclusion of costs for unavailability and production losses through the definition of downtime costs that penalize the model when the system fails. The model can also be extended and generalized to determine the optimal interval between inspections in systems with active or inactive redundancies and with n components.

Introduction

Companies are increasingly expected to uphold high standards of corporate social responsibility that protect not only the environment but also the health and safety of people as a whole. Emerging trends, such as the widespread use of lean manufacturing and six-sigma, have forced industries to operate in a more cost-efficient manner. The optimal combination of redundancy and maintenance effort has become essential to ensuring safety, reducing operational costs by eliminating unnecessary activities, and enabling a steady process flow.

Redundant systems are widely used in industries where risky processes need high levels of reliability. Examples include pump systems in oil refineries, cooling systems in steel plants, turbines in airplanes, and reactors in the nuclear industry. There are two primary types of redundancy: active (hot) redundancy and inactive (cold standby) redundancy. The choice between hot and cold redundancy depends on the components used and the characteristics required of the system.
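The difference between hot and cold redundancy can be illustrated with a minimal sketch. It is not taken from the paper; it assumes identical components with exponentially distributed lifetimes (failure rate lam), perfect switching, and no failures while a cold standby unit is idle. The function names are hypothetical.

```python
import math

def reliability_active(t, lam, n=2):
    """Active (hot) redundancy: at least one of n parallel units survives to t.

    Assumes independent, identical units with exponential lifetimes.
    """
    return 1.0 - (1.0 - math.exp(-lam * t)) ** n

def reliability_cold_standby(t, lam, n=2):
    """Cold standby: units run one after another, so the system lifetime is
    the sum of n exponential lifetimes (an Erlang distribution).

    Assumes perfect switching and no failures while a unit is in standby.
    """
    return math.exp(-lam * t) * sum(
        (lam * t) ** k / math.factorial(k) for k in range(n)
    )

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, reliability_active(t, 1.0), reliability_cold_standby(t, 1.0))
```

Under these idealized assumptions the cold standby arrangement is at least as reliable as the active one at any time t, because idle units do not age. In practice, switching failures and standby degradation can reduce or reverse that advantage.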

In these systems, if continuous monitoring is not possible, periodic inspections are necessary to ensure that the system is working and that adequate redundancy is in place when it is required. During periodic inspections, hidden component failures are detected and repaired at predetermined time intervals. The time interval between inspections should be optimized in order to maximize availability and safety while also minimizing costs [1]. Frequent inspections increase the availability of the system, but involve higher preventive maintenance costs. On the other hand, longer periods between inspections decrease total inspection costs, but can increase the costs of corrective maintenance (system repair, safety accidents) and downtime, since the system can remain unavailable for longer periods [2], [3]. Establishing the optimal interval between inspections is important to ensure satisfactory system availability at the lowest possible cost.

Reliability analysis of redundant systems has been studied for many years using different approaches and methods. The most common methods are Markov and semi-Markov models associated with Laplace transforms and numerical solutions. These methods were used by Laprie et al. [4] to study a system where the operating unit's failure rate increased when the other unit was under repair. Ref. [5] also used these methods to analyze a system with sequential preventive maintenance. The disappointment and interference time were modeled by Kapur and Kapoor [6]. Studying a system subject to preventive and corrective maintenance, Ref. [7] presented an algorithm to calculate the time until the first system down. Ref. [8] used a Markov process to analyze a system with operation and repair priorities, while a system with a unit that switches from a cold-standby to a warm-standby position is modeled in [9].

By combining Markov and semi-Markov processes with genetic algorithms, the authors in [10], [11] optimized the maintenance of multi-state systems. Ref. [12] analyzed a system with flexible intervals between maintenance interventions and Ref. [13] optimized the availability of a manufacturing system. The authors in [14] combined a Markov process, genetic algorithms, and the universal moment generating function (UMGF) to study multi-state degraded systems, while Ref. [15] used non-linear programming instead of the UMGF to optimize a replacement policy. Finally, Ref. [16] analyzed imperfect maintenance using a Markov process and Bayesian networks.

Most of the papers that use Markov processes do not analyze the costs associated with the operation and maintenance of the redundant system. Their main objective is to determine the reliability and availability of the system. A few papers analyze the costs involved in maintenance, but they do not include the costs of unavailability and safety accidents (downtime). For example, [14] considered only the costs of procurement and of preventive and corrective maintenance, and [15] considered only the costs of ordering and purchasing components.

Since the issue of redundant systems with periodic inspections has been only partially covered, this paper aims to develop a model to analyze the reliability and determine the optimal time interval between inspections of redundant systems subjected to periodic inspections using discrete time Markov Chains.

The maintenance process of a redundant system is a stochastic process, since the system has n components and each one can be up or down (in an operating or failed state) at any time. Each combination of component states represents one state in the state space of the Markov Chain. When periodic inspections are employed, the system state (up or down) is observed only at discrete time points (the inspections). This scenario justifies the application of a discrete time Markov Chain.
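The state space described above can be sketched as follows. This is an illustrative simplification, not the paper's own transition matrices, which are not shown in this excerpt. It assumes identical components with exponential lifetimes (rate lam), active redundancy, no component repair, and no intervention at the inspection, so identical components can be lumped by the number of failed units. The names component_states and transition_matrix are hypothetical.

```python
from itertools import product
from math import comb, exp

def component_states(n):
    """All 2**n up/down combinations; 1 = operating, 0 = failed."""
    return list(product((1, 0), repeat=n))

def transition_matrix(n, lam, T):
    """One-step matrix over the number of failed components (0..n).

    P[i][j] is the probability of moving from i failed to j failed
    components during one inspection interval of length T.
    Assumption: active redundancy, no repair, exponential lifetimes.
    """
    q = 1.0 - exp(-lam * T)  # chance that one working unit fails within T
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        working = n - i
        for new_failures in range(working + 1):
            # Binomial probability that exactly new_failures units fail.
            P[i][i + new_failures] = (
                comb(working, new_failures)
                * q ** new_failures
                * (1.0 - q) ** (working - new_failures)
            )
    return P

if __name__ == "__main__":
    print(component_states(2))
    for row in transition_matrix(2, lam=0.1, T=1.0):
        print([round(p, 4) for p in row])
```

With no repair, the state in which all components have failed is absorbing, which is why the last row of the matrix places all probability on its final entry. Models with repair or system renewal at inspections would add transitions back toward the fully operating state.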

The main advantage of the model proposed in this paper is the inclusion of the costs of unavailability and safety accidents through the definition of downtime costs that penalize the cost model when the system fails, so that total costs increase the longer the system is down. In addition, the model can be generalized and used to determine the optimal time interval between inspections in systems with active or inactive redundancies and with n components. Models for two and three components are presented in this paper.

This article is organized as follows. Section 2 describes the research approach, presenting a brief review of the main studies related to preventive maintenance of redundant systems. Section 3 lists the assumptions and notation used and explains the methodology applied to model the problem. In Section 4, numerical examples are presented and analyzed. Section 5 summarizes the article and includes concluding remarks.

Research approach

Over the last 60 years, many papers have been published on preventive maintenance applied to redundant systems. Early studies were concerned with determining system reliability and defining the best time between inspections while maximizing availability. Since the first part of the problem was well explored and due to the increasing importance of the related costs to companies, recent papers aim to determine the optimal time

Systems description

In this paper, four types of redundant systems are studied. Models for redundant systems with two and three components are presented, but the methodology can be extended for redundant systems comprised of more than three components as well. The four systems are described next.

  • System I – Active redundant system without component repair: All redundant components start to work at the same time at the beginning of the system operation. Once a component has failed, it is not possible to repair it

Reliability and cost analysis of redundant systems subject to periodic inspections

At the beginning of this section, the transition probabilities for the four models presented above are developed using Markov Chains. The costs related to the maintenance of the systems are then established, and a cost function is determined and minimized to find the optimal interval between inspections.
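The idea of minimizing a cost function by numerical search can be sketched as below. The cost function here is a hypothetical renewal-style simplification, not the Markov Chain cost function developed in the paper, which is not shown in this excerpt. It assumes a two-component active system with exponential lifetimes (rate lam) that is restored to as-good-as-new at every inspection, and the parameters c_insp, c_repair, and c_down are illustrative names for the inspection cost, the system repair cost, and the downtime cost per unit time.

```python
from math import exp

def cost_rate(T, lam, c_insp, c_repair, c_down):
    """Expected cost per unit time for inspection interval T (hypothetical).

    Assumes two active units, exponential lifetimes, renewal at inspection.
    """
    # Probability that both units have failed by the next inspection.
    p_fail = (1.0 - exp(-lam * T)) ** 2
    # Expected hidden downtime in (0, T): integral of (1 - exp(-lam*t))**2.
    downtime = (
        T
        - 2.0 * (1.0 - exp(-lam * T)) / lam
        + (1.0 - exp(-2.0 * lam * T)) / (2.0 * lam)
    )
    return (c_insp + c_repair * p_fail + c_down * downtime) / T

def best_interval(lam, c_insp, c_repair, c_down,
                  t_min=0.1, t_max=100.0, step=0.1):
    """Grid search for the interval T with the lowest cost rate."""
    best_T, best_cost = None, float("inf")
    steps = int(round((t_max - t_min) / step))
    for k in range(steps + 1):
        T = t_min + k * step
        c = cost_rate(T, lam, c_insp, c_repair, c_down)
        if c < best_cost:
            best_T, best_cost = T, c
    return best_T, best_cost

if __name__ == "__main__":
    print(best_interval(lam=0.01, c_insp=10.0, c_repair=100.0, c_down=50.0))
```

A grid search is used here only because it is simple and makes no assumption about the shape of the cost curve. The sketch reproduces the trade-off described in the introduction: short intervals are dominated by inspection cost, and long intervals by repair and downtime costs.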

Conclusions

This paper presented a model to analyze the reliability and determine the optimal time interval between inspections for redundant systems subject to periodic inspections using discrete time Markov Chains.

An important issue when using redundant systems and periodic inspection is the difficulty in determining the best time interval between inspections to detect and repair hidden failures. The time interval between inspections should be optimized to maximize availability and safety while also

Acknowledgments

We would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) (9471-12-0) and the Conselho Nacional de Pesquisa e Desenvolvimento (CNPq) (140446/2011-7) for providing us with research fellowships.
