Abstract
This paper presents an optimal checkpoint strategy for fault tolerance in real-time systems in which transient faults occur according to a Poisson distribution. In our environment, multiple real-time tasks with different deadlines and harmonic periods are scheduled by the rate-monotonic algorithm, and checkpoints are inserted at a constant interval in each task. When a fault is detected, the system rolls back to the latest checkpoint and re-executes the tasks. We derive the maximum number of re-executable checkpoints and an equation for checking schedulability, and we select the number of checkpoints that maximizes the probability of completing all tasks within their deadlines.
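The trade-off described in the abstract can be illustrated with a minimal sketch for a single task. This is not the paper's multi-task derivation: it assumes one task, equally spaced checkpoints with a fixed overhead each, fault detection at the end of each segment, and re-execution of only the faulty segment. All function and parameter names (`exec_time`, `deadline`, `overhead`, `fault_rate`, `n_max`) are hypothetical.

```python
import math


def max_reexecutions(exec_time, deadline, overhead, n):
    """Number of segment re-executions that fit in the slack (assumed model).

    Returns -1 if the fault-free run with n checkpoints already misses the deadline.
    """
    segment = exec_time / n + overhead      # one segment plus its checkpoint cost
    slack = deadline - n * segment          # time left after a fault-free run
    if slack < 0:
        return -1
    return int(slack // segment)


def completion_probability(exec_time, deadline, overhead, fault_rate, n):
    """Probability of finishing by the deadline under Poisson faults (assumed model)."""
    k = max_reexecutions(exec_time, deadline, overhead, n)
    if k < 0:
        return 0.0
    segment = exec_time / n + overhead
    p_fail = 1.0 - math.exp(-fault_rate * segment)  # at least one fault in a segment
    p_ok = 1.0 - p_fail
    # Negative binomial: n successful segments with at most k failed attempts.
    tail = sum(math.comb(n - 1 + j, j) * p_fail ** j for j in range(k + 1))
    return tail * p_ok ** n


def best_checkpoint_count(exec_time, deadline, overhead, fault_rate, n_max=50):
    """Brute-force search for the checkpoint count with the highest probability."""
    return max(
        range(1, n_max + 1),
        key=lambda n: completion_probability(
            exec_time, deadline, overhead, fault_rate, n
        ),
    )
```

More checkpoints shorten each segment, so a fault costs less to repair, but each checkpoint adds overhead that consumes slack. The brute-force search simply evaluates both effects for every candidate count; the paper's own schedulability equation for harmonic task sets would replace `max_reexecutions` in a faithful implementation.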
Cite this article
Kwak, S.W., Yang, JM. Optimal Checkpoint Placement on Real-Time Tasks with Harmonic Periods. J. Comput. Sci. Technol. 27, 105–112 (2012). https://doi.org/10.1007/s11390-012-1209-0
DOI: https://doi.org/10.1007/s11390-012-1209-0