Analysis and Evaluation of a New Algorithm Based Fault Tolerance for Computing Systems

Hodjat Hamidi, Abbas Vafaei, Seyed Amir Hassan Monadjemi
Copyright: © 2012 | Volume: 4 | Issue: 1 | Pages: 15
ISSN: 1938-0259 | EISSN: 1938-0267 | EISBN13: 9781466612310 | DOI: 10.4018/jghpc.2012010103
MLA

Hamidi, Hodjat, et al. "Analysis and Evaluation of a New Algorithm Based Fault Tolerance for Computing Systems." IJGHPC, vol. 4, no. 1, 2012, pp. 37-51. http://doi.org/10.4018/jghpc.2012010103

APA

Hamidi, H., Vafaei, A., & Monadjemi, S. A. (2012). Analysis and Evaluation of a New Algorithm Based Fault Tolerance for Computing Systems. International Journal of Grid and High Performance Computing (IJGHPC), 4(1), 37-51. http://doi.org/10.4018/jghpc.2012010103

Chicago

Hamidi, Hodjat, Abbas Vafaei, and Seyed Amir Hassan Monadjemi. "Analysis and Evaluation of a New Algorithm Based Fault Tolerance for Computing Systems." International Journal of Grid and High Performance Computing (IJGHPC) 4, no. 1 (2012): 37-51. http://doi.org/10.4018/jghpc.2012010103


Abstract

In this paper, the authors present a new approach to algorithm-based fault tolerance (ABFT) for high-performance computing systems. The ABFT approach transforms a system that does not tolerate a specific type of fault, called the fault-intolerant system, into a system that provides a specific level of fault tolerance, namely recovery. ABFT techniques that detect errors rely on comparing parity values computed in two ways: parallel processing of the input parity values produces output parity values, which are compared with parity values regenerated from the original processed outputs. Convolution codes can be applied to provide the redundancy. This method is a new approach to concurrent error correction in fault-tolerant computing systems. The paper proposes a novel computing paradigm to provide fault tolerance for numerical algorithms. The authors also present, implement, and evaluate early detection in ABFT.
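
The two-way parity comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a plain sum checksum over a matrix-vector product instead of the convolution codes the paper applies, and the names `matvec`, `column_checksum`, `abft_matvec`, and `inject_fault_at` are hypothetical, chosen only for illustration. Integer data is used so that the two parity values can be compared exactly.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def column_checksum(matrix):
    """Input parity: one extra row holding the sum of each column."""
    return [sum(column) for column in zip(*matrix)]


def abft_matvec(matrix, vector, inject_fault_at=None):
    """Compute y = A x and report whether the parity check flags a fault.

    Assumption: a simple sum checksum stands in for the paper's
    convolution-code redundancy.
    """
    outputs = matvec(matrix, vector)
    if inject_fault_at is not None:
        outputs[inject_fault_at] += 1  # simulate a corrupted output value

    # Way 1: push the input parity row through the same operation.
    parity_from_inputs = sum(
        c * x for c, x in zip(column_checksum(matrix), vector)
    )
    # Way 2: regenerate the parity from the processed outputs.
    parity_from_outputs = sum(outputs)

    fault_detected = parity_from_inputs != parity_from_outputs
    return outputs, fault_detected


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = [1, 0, -1]
clean = abft_matvec(A, x)
faulty = abft_matvec(A, x, inject_fault_at=1)
```

In this sketch the two parity values agree for the clean run and differ once a single output is corrupted. With floating-point data the exact comparison would need a tolerance, and a single sum checksum only detects the error; it does not locate or correct it.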
