research-article

Fault-tolerant scheduling in homogeneous real-time systems

Published: 01 March 2014

Abstract

Real-time systems are one of the most important applications of computers, both in commercial terms and in terms of social impact. Increasingly, real-time computers are used to control life-critical applications and need to meet stringent reliability conditions. Since the reliability of a real-time system is related to the probability of meeting its hard deadlines, these reliability requirements translate to the need to meet critical task deadlines with a very high probability. We survey the problem of how to schedule tasks in such a way that deadlines continue to be met despite processor failures (permanent or transient) or software failures.


Published in

ACM Computing Surveys, Volume 46, Issue 4
April 2014
463 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/2597757

Copyright © 2014 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 1 March 2014
• Accepted: 1 September 2013
• Revised: 1 September 2012
• Received: 1 May 2008

Qualifiers

• research-article
• Research
• Refereed
