The design and analysis of AVTMR (all voting triple modular redundancy) and dual–duplex system
Introduction
As industry develops, fault-tolerant systems with high reliability and availability are increasingly required. Developing such a system requires a study of failures, and fault, error, and failure are known to be closely related: a fault can lead to an error, and vice versa [1]. Faults have therefore been studied in order to improve system reliability and prevent system failure. Two approaches have emerged: fault avoidance and fault tolerance. Fault avoidance aims to build a fault-free system by testing it and improving the quality of its electronic components. Because components may develop faults over time, this technique is very difficult to apply, and complete fault avoidance may be impossible for some systems. With fault tolerance, normal operation continues even when a fault occurs, so a fault-tolerant system is more effective in terms of cost and development effort than a fault-avoidance system. A fault-tolerant system usually relies on redundancy, which allows a fault to occur without interrupting normal operation. The available techniques are hardware redundancy, software redundancy, time redundancy and information redundancy.
A hardware fault-tolerant system is better suited to time-critical applications than a software fault-tolerant one. A hardware fault-tolerant system votes on or compares data at the address or data bus level, whereas a software fault-tolerant system performs these functions at the system level.
NASA developed FTMP [2], based on the hardware fault-tolerant technique, for use in commercial airplanes, and SIFT [3], based on the software fault-tolerant technique.
The system proposed in this paper uses hardware redundancy, which is classified as passive, active or hybrid. A passive hardware redundancy system masks faults, so the system continues to operate correctly without the fault being detected. An active hardware redundancy system provides fault detection, fault location and fault recovery. A hybrid hardware redundancy system combines elements of the passive and active approaches.
The AVTMR and dual–duplex systems in this paper use active hardware redundancy [6].
The proposed AVTMR system provides both fault masking and fault detection, so it can identify what problem the system has. It is therefore not a fully active system, but it has active-system characteristics.
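The combination of masking and detection can be illustrated with a minimal sketch of a bitwise 2-of-3 majority voter. This is an illustration only, not the paper's hardware design: the function name `vote`, the integer word representation and the list of disagreeing module indices are assumptions made for the example.

```python
from typing import List, Tuple


def vote(a: int, b: int, c: int) -> Tuple[int, List[int]]:
    """Bitwise 2-of-3 majority vote over three module outputs.

    Hypothetical helper for illustration. Returns the voted word and the
    indices (0, 1, 2) of the modules whose output differs from it.
    """
    # Each output bit is 1 when at least two of the three inputs have a 1.
    voted = (a & b) | (b & c) | (a & c)
    # Detection: report every module that disagrees with the voted word.
    disagreeing = [i for i, word in enumerate((a, b, c)) if word != voted]
    return voted, disagreeing


if __name__ == "__main__":
    # Module 2 delivers a corrupted word; the fault is masked and reported.
    word, faulty = vote(0x5A, 0x5A, 0x5B)
    print(hex(word), faulty)
```

With a single faulty module the voted word equals the output of the two healthy modules, and the index list points at the module that disagreed, which is the kind of information a detection stage could log or act on.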
The proposed dual–duplex system uses active hardware redundancy. Active hardware redundancy systems are divided into cold standby, hot standby and warm standby systems according to the operating status of the system [8].
Among these, the hot standby system has the fastest reconfiguration time. A standby system with a comparator is widely used in systems that require high reliability, availability and safety [9].
The AVTMR and dual–duplex systems are compared with a single system. To evaluate these systems, the failure rates of the electrical components used in each system are calculated with RELEX 6.0 [5]. The Markov model equations are solved in Matlab and Mathematica to evaluate RAMS (Reliability, Availability, Maintainability and Safety) and MTTF.
The failure rates of the electrical components are calculated according to the MIL-HDBK-217F standard [4]. Each designed system is based on the MC68000 [7].
Section snippets
AVTMR system design
Fault-tolerant design techniques comprise passive, active and hybrid hardware redundancy. The AVTMR system uses passive hardware redundancy with fault masking and detection. When one fault is injected into the system, AVTMR, which has a majority voter, still operates correctly, because the majority voter compares its three inputs and outputs the majority value. So, if the AVTMR system has one fault, the fault is masked and
Design of dual–duplex system
The dual–duplex system is a hot standby system in which two MC68000 CPUs on one board operate from a common clock. The data of each CPU are compared at each ‘read’ or ‘write’ point by a comparator implemented with exclusive-OR logic in an ALTERA device (EPM7128LC84). The dual–duplex system comprises two dual systems. A picture of the dual CPU board is shown in Fig. 3. The dual CPU board is VMEbus compatible and carries both CPUs. The address bus, data bus and control bus are compared in
The calculation of the failure rate
The failure rate is the most important element in evaluating reliability, availability, safety and MTTF (Mean Time To Failure). The failure rate is represented in Eq. (1).
The failure rate is calculated according to MIL-HDBK-217F using the RELEX tool (Table 1). The failure rates of commercial and MILSPEC components are calculated, and the system evaluation is compared for each set of failure rates.
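To show how a failure rate feeds into reliability and MTTF, the following sketch assumes the standard constant-failure-rate (exponential) model, in which R(t) = exp(-λt) and MTTF = 1/λ, and a series combination in which component failure rates add. These are assumptions for illustration and are not necessarily the form of Eq. (1). The function names and the component values are hypothetical. Rates are given in failures per 10^6 hours, the unit used by MIL-HDBK-217F.

```python
import math
from typing import Iterable


def system_failure_rate(component_rates_per_1e6h: Iterable[float]) -> float:
    """Series-system failure rate in failures per hour.

    Assumes every component is needed, so the component rates simply add.
    """
    return sum(component_rates_per_1e6h) * 1e-6


def reliability(lam: float, t_hours: float) -> float:
    """R(t) = exp(-lambda * t) under a constant failure rate."""
    return math.exp(-lam * t_hours)


def mttf(lam: float) -> float:
    """Mean time to failure, 1 / lambda, under a constant failure rate."""
    return 1.0 / lam


if __name__ == "__main__":
    # Hypothetical component failure rates, not values from Table 1.
    lam = system_failure_rate([4.0, 6.0])
    print(f"lambda = {lam:.3e} per hour")
    print(f"MTTF = {mttf(lam):.0f} hours")
    print(f"R(1 year) = {reliability(lam, 8760.0):.4f}")
```

Under this model, a lower component failure rate directly lengthens the MTTF and raises R(t) at every t, which is why the choice between commercial and MILSPEC parts matters for the comparison.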
The calculated failure rate
System modeling
The Markov modeling technique is used to evaluate the dual–duplex system. Markov modeling provides a probabilistic model of the system in terms of its state transitions. The state transitions of system failure are represented in a discrete-time model, from which system reliability and availability are evaluated.
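As a rough illustration of discrete-time Markov evaluation, the sketch below steps a generic three-state chain for a duplex arrangement with fault coverage. It is not the paper's model: the states (0 = both modules good, 1 = one module failed and covered, 2 = system failed), the first-order transition probabilities λΔt, and all function and parameter names are assumptions chosen for the example.

```python
from typing import List


def transition_matrix(lam: float, coverage: float, dt: float) -> List[List[float]]:
    """One-step transition probabilities for a hypothetical 3-state chain.

    lam is the per-module failure rate, coverage the probability that a
    fault is detected and handled, dt the time step. Only one failure per
    step is considered, so terms of order (lam * dt)**2 are ignored.
    """
    q = lam * dt
    return [
        [1.0 - 2.0 * q, 2.0 * q * coverage, 2.0 * q * (1.0 - coverage)],
        [0.0, 1.0 - q, q],
        [0.0, 0.0, 1.0],  # the failed state is absorbing (no repair)
    ]


def step(p: List[float], P: List[List[float]]) -> List[float]:
    """Advance the state probability vector by one time step: p' = p P."""
    n = len(p)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]


def reliability_after(lam: float, coverage: float, dt: float, steps: int) -> float:
    """Probability of being in an operational state (0 or 1) after `steps`."""
    P = transition_matrix(lam, coverage, dt)
    p = [1.0, 0.0, 0.0]  # start in the fully operational state
    for _ in range(steps):
        p = step(p, P)
    return p[0] + p[1]


if __name__ == "__main__":
    # Hypothetical numbers: lam per hour, dt = 1 hour, 1000 steps.
    for c in (0.90, 0.99, 1.00):
        print(c, reliability_after(1e-4, c, 1.0, 1000))
```

In this toy chain, raising the coverage moves probability from the direct failure transition to the covered one, so the computed reliability increases with coverage.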
In this paper, we construct the transition model under two assumptions for Markov modeling.
- (1) Only one failure will occur at a time.
- (2) The system starts in the perfect operation where all of the system's modules are
Reliability
The reliability of the single system (SS), AVTMR (All Voting Triple Modular Redundancy) and dual–duplex (DD) systems is shown in Figs. 11 and 12. In the figures, m denotes a system built from military components and c one built from commercial components. In both figures, the reliability of the military-component system is higher than that of the commercial one.
The difference between Figs. 11 and 12 is the fault coverage. In the dual–duplex system, if a fault is detected, the system switches over to the standby system. But the AVTMR system does not give a serious
Conclusion
In this paper, a single system, an AVTMR system and a dual–duplex system are designed and compared in terms of RAMS (Reliability, Availability, Maintainability and Safety).
Overall, the dual–duplex system shows the best RAMS characteristics when it has high fault coverage. Its standby characteristic makes good system quality easy to achieve. However, the dual–duplex system takes considerable time and money to develop. For example, the dual–duplex system needs more electrical
References (11)
- Design and analysis of fault tolerant digital systems (1989)
- et al.
FTMP—a highly reliable fault-tolerant multiprocessor for aircraft
Proc IEEE
(1978)
- SIFT: design and analysis of a fault tolerant computer for aircraft control
Proc IEEE
(1978)
- Military handbook 217F. USA: Department of...
- RELEX 6.0 user guide. USA: RELEX Corporation;...
Cited by (44)
Architecture for safety–critical transportation systems
2023, Microprocessors and Microsystems
A sequence-based method for dynamic reliability assessment of MPD systems
2021, Process Safety and Environmental Protection
Safety-based availability assessment at design stage
2014, Computers and Industrial Engineering
Citation Excerpt: However, they determine for a particular system, average operational availability from technical point of view. Kim, Lee, and Lee (2005), to improve availability for high risk of production loss systems, proposed the components redundancy strategy. Juang, Lin, and Kao (2008) presented a solution for availability integration in design process.
Performance evaluation of subsea BOP control systems using dynamic Bayesian networks with imperfect repair and preventive maintenance
2013, Engineering Applications of Artificial Intelligence
Citation Excerpt: The results were compared with those obtained by means of Monte Carlo simulations based on Petri net models. Kim et al. developed all voting triple modular redundancy system, dual-duplex system and double 2-out-of-2 system, and assessed the reliability with respect to fault coverage by using discrete-time Markov modeling technique (Kim et al., 2005; Wang et al., 2007). Parashar and Taneja (2007) presented a PLC hot standby system based on master–slave concept and two types of repair facilities (ordinary repairman and expert repairman), and evaluated the reliability and profit by using semi-Markov processes.
Operational reliability analysis of remote operated vehicle based on dynamic Bayesian network synthesis method
2024, Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability
A Threshold Design Method of Aircraft Sensor Fault Detection Monitor
2023, Journal of Physics: Conference Series