Checkpoint Management with Double Modular Redundancy Based on the Probability of Task Completion

Kwak, Seong Woo; You, Kwan-Ho; Yang, Jung-Min

doi:10.1007/s11390-012-1222-3

Short Paper
Published: 05 March 2012

Volume 27, pages 273–280, (2012)

Journal of Computer Science and Technology

Seong Woo Kwak¹,
Kwan-Ho You² &
Jung-Min Yang³

Abstract

This paper proposes a checkpoint rollback strategy for real-time systems with double modular redundancy. Without relying on built-in fault detection or spare processors, the scheme recovers from both transient and permanent faults. Two comparisons are made at each checkpoint. First, the states stored in two consecutive checkpoints of one processor are compared to check the integrity of that processor. Second, the states of the two processors are compared to detect faults, and the system rolls back to the previous checkpoint whenever the logic of the proposed scheme requires it. The fault recovery scheme induces a Markov model, which is analyzed to obtain the probability that the task completes within its deadline. The number of checkpoints is then chosen to maximize this probability.
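The idea of choosing the number of checkpoints to maximize the probability of meeting a deadline can be illustrated with a deliberately simplified sketch. This is not the Markov model of the paper, which is not reproduced on this page. The sketch assumes only transient faults arriving as a Poisson process with rate `fault_rate` on each of two processors, equal-length segments, a fixed per-checkpoint `overhead`, and a rollback of exactly one segment whenever the two processors disagree. All function and parameter names are hypothetical.

```python
from math import comb, exp, floor


def completion_probability(n, task_time, deadline, overhead, fault_rate):
    """Probability of finishing n segments before the deadline (toy model).

    Assumptions (not from the paper): transient faults only, Poisson
    arrivals with rate `fault_rate` per processor, and a faulty segment
    is simply re-executed from the previous checkpoint.
    """
    segment = task_time / n + overhead          # time for one attempt
    slots = floor(deadline / segment + 1e-9)    # attempts that fit in time
    if slots < n:
        return 0.0
    # A segment succeeds when neither of the two processors sees a fault.
    p = exp(-2.0 * fault_rate * segment)
    # The task completes iff at least n of the available attempts succeed.
    return sum(
        comb(slots, k) * p**k * (1.0 - p) ** (slots - k)
        for k in range(n, slots + 1)
    )


def best_checkpoint_count(task_time, deadline, overhead, fault_rate, max_n=50):
    """Exhaustively search 1..max_n for the count with the highest probability."""
    return max(
        range(1, max_n + 1),
        key=lambda n: completion_probability(
            n, task_time, deadline, overhead, fault_rate
        ),
    )


if __name__ == "__main__":
    n_opt = best_checkpoint_count(10.0, 15.0, 0.1, 0.05)
    print(n_opt, completion_probability(n_opt, 10.0, 15.0, 0.1, 0.05))
```

Even in this toy model the trade-off is visible: too few checkpoints make each rollback expensive, while too many spend the slack on checkpoint overhead, so the probability of completion peaks at an intermediate count.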

Author information

Authors and Affiliations

¹ Seong Woo Kwak: Department of Electronic Engineering, Keimyung University, Daegu, 704-701, Korea
² Kwan-Ho You: School of Information & Communication Engineering, Sungkyunkwan University, Suwon, 440-746, Korea
³ Jung-Min Yang: Department of Electrical Engineering, Catholic University of Daegu, Daegu, 712-702, Korea

Corresponding author

Correspondence to Seong Woo Kwak.

About this article

Cite this article

Kwak, S.W., You, KH. & Yang, JM. Checkpoint Management with Double Modular Redundancy Based on the Probability of Task Completion. J. Comput. Sci. Technol. 27, 273–280 (2012). https://doi.org/10.1007/s11390-012-1222-3

Received: 12 April 2011
Revised: 27 September 2011
Published: 05 March 2012
Issue Date: March 2012
DOI: https://doi.org/10.1007/s11390-012-1222-3
