Abstract:
Electronic components in space applications are subject to high levels of ionizing and particle radiation. The former reduces their lifetime (especially at high levels of utilization), and the latter can cause transient errors. Transient errors can be detected and corrected using memory scrubbing. However, scrubbing incurs an overhead that reduces both the availability and the lifetime of the system. In this work, we present a mechanism based on dynamic hidden Markov models (D-HMMs) that balances the availability and lifetime of a multi-resource system by estimating the occurrence of permanent faults amid transient faults, and by dynamically migrating the computation to excess resources when a failure occurs. The dynamic nature of the model makes it adaptable to different mission profiles and fault rates. Results show that our model is able to lead systems to their desired lifetime while keeping availability within 2% of its ideal value, and that it outperforms static rule-based and traditional hidden Markov model (HMM) approaches.
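To make the idea of estimating permanent faults amid transient faults concrete, the following is a minimal sketch of a plain two-state HMM forward filter, not the D-HMM of the paper. It assumes a hidden state that is either "healthy" or "permanently faulty" (absorbing), and one binary observation per scrubbing pass (error found or not). The function names, the parameter values (`p_fail`, `p_transient`, `p_perm_err`) and the migration threshold are all hypothetical and chosen for illustration only.

```python
def filter_permanent_fault(observations, p_fail=0.01, p_transient=0.05,
                           p_perm_err=0.9, prior=0.0):
    """Return P(permanent fault) after each scrub observation.

    observations: iterable of 0/1, where 1 means the scrub found an error.
    p_fail:       assumed per-step chance a healthy resource fails permanently.
    p_transient:  assumed chance a healthy resource shows a (transient) error.
    p_perm_err:   assumed chance a permanently faulty resource shows an error.
    """
    belief = prior
    history = []
    for obs in observations:
        # Predict step: a healthy resource may fail; a permanent fault persists.
        predicted = belief + (1.0 - belief) * p_fail
        # Update step: weigh both hypotheses by how well they explain obs.
        like_faulty = p_perm_err if obs else 1.0 - p_perm_err
        like_healthy = p_transient if obs else 1.0 - p_transient
        numerator = like_faulty * predicted
        belief = numerator / (numerator + like_healthy * (1.0 - predicted))
        history.append(belief)
    return history


def should_migrate(belief, threshold=0.95):
    """Hypothetical rule: move the computation once a fault is near-certain."""
    return belief >= threshold


if __name__ == "__main__":
    isolated = filter_permanent_fault([1, 0, 0])       # one upset, then clean
    persistent = filter_permanent_fault([1, 1, 1, 1])  # errors keep recurring
    print(should_migrate(isolated[-1]), should_migrate(persistent[-1]))
```

Under these assumed parameters, an isolated error followed by clean scrubs leaves the fault belief low, whereas recurring errors push it past the threshold. A dynamic variant would additionally adapt the transition and emission probabilities over the mission, which this sketch does not attempt.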
Date of Conference: 14-17 July 2014
Date Added to IEEE Xplore: 21 August 2014
Electronic ISBN: 978-1-4799-5356-1