Abstract
Modular redundancy and temporal redundancy are traditional techniques for increasing system reliability. With technology advancements, slack time in a system can serve not only as temporal redundancy but also as an opportunity for energy management schemes to save energy. In this paper, we consider the combination of modular and temporal redundancy to achieve energy-efficient, reliable real-time service provided by multiple servers. We first propose an efficient adaptive parallel recovery scheme that appropriately processes service requests in parallel to increase the number of faults that can be tolerated, and thus system reliability. We then explore schemes to determine the optimal redundant configurations of the parallel servers, either to minimize system energy consumption for a given reliability goal or to maximize system reliability for a given energy budget. Our analysis results show that small requests, optimistic approaches, and parallel recovery favor lower levels of modular redundancy, while large requests, pessimistic approaches, and restricted serial recovery favor higher levels of modular redundancy.
Cite this article
Zhu, D., Melhem, R. & Mossé, D. Energy efficient redundant configurations for real-time parallel reliable servers. Real-Time Syst 41, 195–221 (2009). https://doi.org/10.1007/s11241-009-9067-8
DOI: https://doi.org/10.1007/s11241-009-9067-8