ABSTRACT
Methods such as rollback and modular redundancy correct transient errors efficiently. In hard real-time systems, however, error correction has a strong impact on response times, including those of tasks that were not directly affected by errors. When they miss their deadlines, these tasks eventually fail to provide correct service. In this paper, we present a reliability analysis for periodic task sets with static priorities that includes realistic detection and rollback scenarios. Because it covers a full hyperperiod instead of just a critical instant, it achieves much higher accuracy than previous approaches. We compare the approach with Monte Carlo simulation to demonstrate its accuracy, and with previous approaches that cover only critical instants to evaluate the improvement.
Index Terms
- Reliability analysis for MPSoCs with mixed-critical, hard real-time constraints