Redundancy strategies assessment and optimization of k-out-of-n systems based on Markov chains and genetic algorithms

doi:10.1016/j.ress.2021.108277

Reliability Engineering & System Safety

Volume 221, May 2022, 108277

https://doi.org/10.1016/j.ress.2021.108277

Highlights

• We revisit the redundancy allocation problem under different strategies.
• We consider active, standby, mixed and K-mixed strategies for k-out-of-n configurations.
• We develop Markov chain models and compare the results to Monte Carlo simulation.
• A reliability optimization problem is formulated and solved using a genetic algorithm.
• The proposed approach is efficient for evaluating and optimizing system reliability.

Abstract

The existing reliability formulation for mixed and K-mixed strategies considers only one component as the minimum number of required components for each subsystem. In this paper, we develop a Continuous-Time Markov Chain model for both mixed and K-mixed strategies in which the minimum number of required components can be more than one and take any value. In addition, the drawbacks of the classical model, namely its complicated formulation, approximate solution and time-consuming problem-solving process, are addressed. The proposed model estimates the reliability under different redundancy strategies more efficiently and in a straightforward way. To validate the proposed approach, a sequential Monte Carlo model is also developed for the reliability analysis. In addition, the existing strategies are applied to a series-parallel system and an efficient genetic algorithm is developed to solve the resulting optimization problem. The numerical results confirm the accuracy of the Continuous-Time Markov Chain model in estimating k-out-of-n system reliability under standby, mixed and K-mixed strategies with a major reduction in the computation time.

Introduction

A system consisting of n components, of which at least k must be functioning for system success, is called a "k-out-of-n" configuration. The redundancy strategy of this type of system can be planned to maximize its reliability. Several studies have calculated the reliability of such systems under different redundancy strategies (e.g., [1], [2], [3]). Generally, there are four types of redundancy strategies in the context of the Redundancy Allocation Problem (RAP), namely active, standby, mixed, and K-mixed. In an active strategy, all redundant components start working when the system comes online, while in a standby strategy, the system functions with only the minimum number of required components. When one of these components fails, a switching system replaces the failed component with a redundant one. It should be noted that the switching system itself is also prone to breakdowns and component malfunctions and hence can fail. The overall system continues working until the number of active components drops below k or the switching system cannot replace a failed component with a functional redundant one. When only one component is necessary for system operation, the failure of the working component together with a failure of the switching system leads to the breakdown of the entire system [4]. There are three cases of standby redundancy in the literature [3]: cold, warm, and hot standby, which are distinguished by how likely a standby component is to fail because of the operational stress associated with system operation.
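For the active strategy, the definition above leads to the standard binomial expression for a k-out-of-n system. The sketch below assumes independent, identical components with a common reliability r at the mission time; the function name and the value r = 0.9 are illustrative assumptions and are not taken from the paper.

```python
from math import comb


def active_k_out_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent, identical active
    components (each with reliability r) are working."""
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    # Sum the binomial probabilities of exactly i working components, i = k..n.
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# A 2-out-of-5 configuration with an assumed component reliability of 0.9.
print(active_k_out_of_n_reliability(2, 5, 0.9))
```

With these assumed values the result is close to 0.99954, because the system fails only when zero or one component survives.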

Many studies have been conducted on systems in which only one component is required for system operation. In some studies, active and standby strategies are considered as predefined strategies for each subsystem [5,6], while in other studies [7,8] redundancy strategies are considered as decision variables that need to be determined. Abouei et al. [9] proposed the mixed strategy, which is a combination of active and standby strategies. According to this strategy, the numbers of active and standby components are decision variables that must be optimally selected. When all active components fail, standby components are switched in one at a time, and the system fails when all components have broken down. The authors applied this strategy to different cases and showed that it outperforms both active and standby strategies in most cases [9,10].

The K-mixed redundancy strategy, which is a generalization of the mixed strategy, was introduced by Peiravi et al. [11]. The authors showed that when the switching system is not highly reliable, the K-mixed strategy yields significant improvements over other strategies [12]. To evaluate the efficiency of their model, a series-parallel system was evaluated, and its results were compared with the existing methods in the literature. The results confirmed that the new approach outperforms previous strategies in several problem instances [12].

It is worth mentioning that the numbers of active and standby components under both mixed and K-mixed strategies are determined by solving an optimization model that aims to maximize system reliability. Nevertheless, it is computationally difficult to estimate system reliability with classical reliability models for systems with more than four redundant components. Furthermore, the existing approximation approaches provide only a lower bound on the actual system reliability [11,12]. To measure the exact system reliability under a standby strategy, Kim and Kim [13] developed a Markov model and applied it in the context of a reliability-redundancy allocation problem. In another paper, the authors provided the same reliability model considering a Phase-type time-to-failure distribution for components [14]. Recently, a similar reliability model was proposed for the mixed strategy [15].

However, no prior work has considered mixed and K-mixed strategies for k-out-of-n systems. The development of a Continuous-Time Markov Chain (CTMC) model for reliability calculation of k-out-of-n systems under mixed and K-mixed strategies is a novel contribution of this study. An example of a k-out-of-n system (2-out-of-5) under K-mixed and mixed strategies is shown in Fig. 1. In this figure, green, yellow, red, and non-colored boxes represent the required online components (k), extra online components, failed components, and standby components, respectively. In both strategies, the system starts functioning with more than k active components. Despite their similarities, there is a significant difference between the K-mixed and the mixed strategies with respect to the switching system. In the mixed strategy, the switching system is activated only when the number of active components drops below the minimum number of required components; therefore, a switching failure results in subsystem failure. In the K-mixed strategy, however, the switching system is used to substitute a redundant component as soon as the first extra online component fails. In other words, the K-mixed strategy tries to keep the initial number of active components (K) functional during the whole operation time.

The reliability of k-out-of-n systems has also been extensively studied. The redundancy allocation problem in k-out-of-n systems was studied, for example, in [3] under the assumption that the redundancy strategy (e.g., active or cold standby) for each subsystem was fixed beforehand. An algorithm proposed in [16] is used to determine the reliability of a weighted k-out-of-n system. In [17], a dynamic set of k-out-of-n component partnerships was described, where the number of required components, k, changes dynamically in response to failures. A study by Li et al. [18] analyzed a sequential k-out-of-n system. Hamdan et al. [19] developed an optimal preventive maintenance model that takes average cost and availability into account for weighted k-out-of-n systems. An example of a load-sharing k-out-of-n system containing non-identical multi-state subsystems is given in [20].

The classical approach to reliability calculation relies on a complicated formula that involves several double integrals. Calculating the reliability of a single configuration under the K-mixed or mixed strategy takes a few seconds on a personal computer. However, solving combinatorial reliability optimization problems requires evaluating the objective function (i.e., the system reliability) for a large number of candidates in a huge search space. Developing a fast and accurate evaluation method is therefore crucial for solving large combinatorial reliability optimization problems. To answer this need, our contribution is three-fold.

First, by focusing on k-out-of-n configurations, a Continuous-Time Markov Chain (CTMC) model is developed to calculate the system reliability under the standby strategy, where the switch is not continuously monitoring the system and is only triggered upon a component failure. It is noteworthy that this case is more realistic in industrial systems. For instance, to verify switch performance, the system operator may trigger it twice or more during the mission. Second, the classical reliability models take longer to calculate the system reliability under mixed and K-mixed strategies than under active and standby strategies. Therefore, in this research, new CTMC models are developed to calculate system reliability under mixed and K-mixed strategies for k-out-of-n configurations. Indeed, the proposed CTMC models calculate system reliability for any number of essential functioning components, k, including 1-out-of-n systems. The system reliability is also evaluated using a Monte Carlo simulation model to test the accuracy of the proposed approach. Lastly, we combine the suggested evaluation models with a genetic algorithm to efficiently solve a reliability optimization problem in the context of a series-parallel system, and we compare the execution time with that reported in the literature.
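To make the idea of a CTMC for a standby subsystem with a failure-triggered switch concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: k identical active components with a constant failure rate lam, n - k cold spares that cannot fail while idle, and a switch that succeeds with probability rho each time it is triggered. The transient state is the number of spares already used, and the survival probability is obtained by uniformization. All names and parameter values are hypothetical, and this is not the authors' model.

```python
from math import exp, factorial


def standby_generator(k, n, lam, rho):
    """Generator restricted to the transient (working) states.
    State j = number of cold spares already switched in, j = 0..n-k."""
    size = n - k + 1
    q = [[0.0] * size for _ in range(size)]
    for j in range(size):
        q[j][j] = -k * lam              # some active component fails
        if j + 1 < size:
            q[j][j + 1] = k * lam * rho  # failure followed by a successful switch
    return q  # the missing row mass flows to the absorbing "failed" state


def survival_probability(q, t, tol=1e-12, max_terms=10_000):
    """P(still in a transient state at time t), starting in state 0,
    computed by uniformization of the sub-generator q."""
    size = len(q)
    rate = max(-q[i][i] for i in range(size))
    if rate == 0.0:
        return 1.0
    p = [[(1.0 if i == j else 0.0) + q[i][j] / rate for j in range(size)]
         for i in range(size)]
    v = [1.0] + [0.0] * (size - 1)   # distribution after m uniformized jumps
    weight = exp(-rate * t)          # Poisson(rate * t) probability of m = 0
    total = cum = 0.0
    m = 0
    while cum < 1.0 - tol and m < max_terms:
        total += weight * sum(v)
        cum += weight
        v = [sum(v[i] * p[i][j] for i in range(size)) for j in range(size)]
        m += 1
        weight *= rate * t / m
    return total


def standby_closed_form(k, n, lam, rho, t):
    """Closed form for the same assumptions, used only as a cross-check."""
    a = k * lam * t
    return sum(exp(-a) * (a * rho) ** j / factorial(j) for j in range(n - k + 1))


# Assumed values: 2-out-of-5, lam = 1e-3 per hour, rho = 0.95, 1000 h mission.
q = standby_generator(2, 5, 1e-3, 0.95)
print(survival_probability(q, 1000.0), standby_closed_form(2, 5, 1e-3, 0.95, 1000.0))
```

Because every working state has the same exit rate in this simplified case, a closed form exists and the two printed numbers should agree to within the truncation tolerance. For mixed or K-mixed strategies the state space is richer, and only the generator construction would need to change.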

The rest of the paper is structured as follows. In Section 2, we formulate the CTMC-based reliability model proposed for calculating k-out-of-n subsystem reliability under all the above-mentioned strategies, where the switching system can be either continuously monitored or triggered in response to a failure. Section 3 presents a Monte Carlo simulation model to validate the results obtained by the Markov model. In Section 4, two numerical examples are presented to demonstrate the efficiency of the proposed methodology. Finally, conclusions are presented in Section 5.
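The validation step announced for Section 3 can be illustrated with a small sequential Monte Carlo simulation. The sketch assumes the same simplified standby subsystem as a cold-standby k-out-of-n model with exponential lifetimes: k active components with failure rate lam, n - k cold spares, and a switch that works with probability rho on each demand. These assumptions, the function names and the parameter values are illustrative only and do not reproduce the authors' simulation model.

```python
import random
from math import exp, factorial


def simulate_mission(k, n, lam, rho, mission_time, rng):
    """Simulate one mission history; return True if the subsystem survives."""
    clock = 0.0
    spares = n - k
    while True:
        # With k exponential components active, the next failure arrives
        # after an Exp(k * lam) time (memoryless property).
        clock += rng.expovariate(k * lam)
        if clock >= mission_time:
            return True                    # mission completed before the failure
        if spares == 0 or rng.random() >= rho:
            return False                   # no spare left, or the switch failed
        spares -= 1                        # a cold spare replaces the failed unit


def estimate_reliability(k, n, lam, rho, mission_time, runs=100_000, seed=1):
    """Fraction of simulated missions that survive."""
    rng = random.Random(seed)
    survived = sum(simulate_mission(k, n, lam, rho, mission_time, rng)
                   for _ in range(runs))
    return survived / runs


def standby_closed_form(k, n, lam, rho, t):
    """Analytical value under the same assumptions, for comparison."""
    a = k * lam * t
    return sum(exp(-a) * (a * rho) ** j / factorial(j) for j in range(n - k + 1))


estimate = estimate_reliability(2, 5, 1e-3, 0.95, 1000.0)
exact = standby_closed_form(2, 5, 1e-3, 0.95, 1000.0)
print(estimate, exact)
```

The estimate carries sampling error of order 1/sqrt(runs), so agreement with an analytical or Markov result is expected only up to that noise level.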


Reliability model

Two cases for an imperfect switching/detection mechanism have been proposed in [4]: I) continuous monitoring and detection, and II) detection and switching in response to a failure. In Case I, the system performance is being observed for failure detection and, if one is detected, the right measures would be taken based on the imposed redundancy strategy. In Case II, the switching system is triggered only in direct response to a component failure. If the switch is failed upon triggering, it will

Monte-Carlo simulation model

To verify the accuracy of the transient matrix constructed in previous sections, a sequential Monte-Carlo Simulation Model (MCM) is developed in this section. The goal of this model is to estimate the reliability of a given subsystem under different redundancy allocation strategies considered in this study. The MCM mimics the components’ history of failure by using their state probability distributions. Statistics are then obtained and statistical computation is used to estimate different

Numerical results

To illustrate the efficiency of the proposed CTMC models, a single subsystem with four components is first considered, in which the minimum number of required components can be one or two. It is assumed that the component and switch reliability values are predefined. Afterward, the proposed reliability model is implemented for a series-parallel system that has been widely studied in the literature.
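As a rough illustration of how a genetic algorithm can be coupled with a fast reliability evaluator for a series-parallel system, the sketch below optimizes only the number of components per subsystem under the active strategy and a cost budget. The subsystem data, the budget, the upper bound on components and the GA settings are all hypothetical. The paper's algorithm also selects redundancy strategies and uses CTMC-based evaluation, neither of which is reproduced here.

```python
import random
from math import comb

# Hypothetical data per subsystem: (k required, component reliability, unit cost).
SUBSYSTEMS = [(1, 0.90, 3.0), (2, 0.85, 2.0), (1, 0.80, 4.0)]
BUDGET = 30.0   # assumed total cost limit
MAX_N = 6       # assumed upper bound on components per subsystem


def k_out_of_n(k, n, r):
    """Active k-out-of-n reliability for identical, independent components."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


def system_reliability(design):
    """Series system of k-out-of-n subsystems; design[i] = n for subsystem i."""
    rel = 1.0
    for (k, r, _), n in zip(SUBSYSTEMS, design):
        rel *= k_out_of_n(k, n, r)
    return rel


def cost(design):
    return sum(c * n for (_, _, c), n in zip(SUBSYSTEMS, design))


def fitness(design):
    # Designs that exceed the budget are given zero fitness.
    return system_reliability(design) if cost(design) <= BUDGET else 0.0


def genetic_search(pop_size=40, generations=60, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(k, MAX_N) for k, _, _ in SUBSYSTEMS]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]                           # elitism: keep the incumbent
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, len(SUBSYSTEMS) - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i, (k, _, _) in enumerate(SUBSYSTEMS):
                if rng.random() < mutation_rate:       # random-reset mutation
                    child[i] = rng.randint(k, MAX_N)
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)


best_design, best_reliability = genetic_search()
print(best_design, cost(best_design), best_reliability)
```

Each fitness call here is a cheap closed-form evaluation. The same loop structure is what makes a fast evaluator valuable, since the objective is evaluated thousands of times during the search.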

Conclusion

Classical reliability models corresponding to the redundancy strategies, such as standby, mixed, and K-mixed suffer from a high degree of computational complexity and in some cases provide a lower bound on the actual system reliability. In addition, the mixed and K-mixed strategies have not been considered in k-out-of-n systems in the literature. In this paper, a CTMC modeling approach for the k-out-of-n system is proposed for calculating the exact system reliability under the aforementioned

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank the Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting this project. They would also like to thank the editor, the associate editor and the anonymous referees for their constructive comments and recommendations, which have significantly improved the presentation of this paper.

References (31)

R. Ahmadi
Reliability and maintenance modeling for a load-sharing k-out-of-n system subject to hidden failures
Comput Ind Eng
(2020)
T. Jin et al.
Variance of Reliability Estimate for k-out-of-n Systems with Cold Standby Units
W. Wang et al.
A study of interval analysis for cold-standby system reliability optimization under parameter uncertainty
Comput Ind Eng
(2016)
M.A. Mellal et al.
System reliability-redundancy optimization with cold-standby strategy by an enhanced nest cuckoo optimization algorithm
Reliab Eng Syst Saf
(2020)
W. Wang et al.
Multi-objective optimization of reliability-redundancy allocation problem for multi-type production systems considering redundancy strategies
Reliab Eng Syst Saf
(2020)
H. Kim et al.
Reliability–redundancy allocation problem considering optimal redundancy strategy using parallel genetic algorithm
Reliab Eng Syst Saf
(2017)
H. Kim et al.
Reliability models for a nonrepairable system with heterogeneous components having a phase-type time-to-failure distribution
Reliab Eng Syst Saf
(2017)
A. Peiravi et al.
A new Markov-based model for reliability optimization problems with mixed redundancy strategy
Reliab Eng Syst Saf
(2020)
S. Eryilmaz et al.
The lost capacity by the weighted k-out-of-n system upon system failure
Reliab Eng Syst Saf
(2021)
D.W. Coit et al.
Dynamic k-out-of-n system reliability with component partnership
Reliab Eng Syst Saf
(2015)
M. Li et al.
Reliability modeling for repairable circular consecutive-k-out-of-n: F systems with retrial feature
Reliab Eng Syst Saf
(2021)
K. Hamdan et al.
Optimal preventive maintenance for repairable weighted k-out-of-n systems
Reliab Eng Syst Saf
(2021)
H. Kim
Optimal reliability design of a system with k-out-of-n subsystems considering redundancy strategies
Reliab Eng Syst Saf
(2017)
P.P. Guilani et al.
Sequence optimization in reliability problems with a mixed strategy and heterogeneous backup scheme
Reliab Eng Syst Saf
(2020)
A. Yalaoui et al.
Reliability allocation problem in a series–parallel system
Reliab Eng Syst Saf
(2005)
