Detection of Silent Data Corruption in fault-tolerant distributed systems on board spacecraft | IEEE Conference Publication | IEEE Xplore

Abstract:

This paper presents a novel distributed architecture for system-level Fault Detection, Isolation and Recovery (FDIR) aimed at spacecraft applications. The architecture reconfigures itself when a failure occurs, so that operation continues seamlessly. Two new algorithms for detecting Silent Data Corruption (SDC) errors are proposed: a selective redundancy method for transient SDC errors, and a distributed mechanism based on a data signature value for permanent SDC errors. Experimental results from a prototype built on Xilinx Zynq FPGAs show that the proposed method can detect SDC faults in distributed nodes and tolerates node failures by migrating tasks to healthy nodes. Evaluation results show that the proposed SDC detection algorithms achieve very good fault coverage while using far fewer additional resources than physical redundancy.
Date of Conference: 14-17 July 2014
Date Added to IEEE Xplore: 21 August 2014
Electronic ISBN: 978-1-4799-5356-1
Conference Location: Leicester, UK
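
The abstract names two detection ideas (selective redundancy for transient SDC, a data signature for permanent SDC) without giving their details. The following is only a minimal illustrative sketch of those general ideas, not the authors' algorithms. Everything in it is an assumption: the function names, the use of CRC32 as the signature, the re-execution of tasks flagged as critical, and the majority comparison across nodes are all hypothetical choices made for illustration.

```python
import zlib
from collections import Counter


def run_with_selective_redundancy(task, arg, critical):
    """Run `task` once, or twice when it is flagged critical (assumed policy).

    Returns (result, sdc_suspected). A mismatch between the two runs is
    treated as a sign of a transient silent data corruption.
    """
    first = task(arg)
    if not critical:
        # Non-critical tasks are not duplicated, which saves resources.
        return first, False
    second = task(arg)
    return first, first != second


def signature(data: bytes) -> int:
    """Compact signature of a node's output (CRC32 is an assumed choice)."""
    return zlib.crc32(data)


def find_suspect_nodes(outputs: dict) -> list:
    """Return nodes whose signature differs from the most common signature.

    `outputs` maps a node name to the bytes that node produced for the
    same input. A node that keeps disagreeing may have a permanent fault.
    """
    sigs = {node: signature(out) for node, out in outputs.items()}
    majority_sig, _ = Counter(sigs.values()).most_common(1)[0]
    return sorted(node for node, s in sigs.items() if s != majority_sig)
```

In this sketch, a node reported by `find_suspect_nodes` would be a candidate for isolation, with its tasks migrated to healthy nodes as the abstract describes; how the actual architecture makes that decision is not stated in the abstract.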
