Abstract
We present fault detectors for transient faults (i.e., corruptions of the memory of the processors, but not of their code). We distinguish fault detectors for tasks (i.e., the problem to be solved) from failure detectors for implementations (i.e., the algorithm that solves the problem). The aim of our fault detectors is to detect a memory corruption as soon as possible. We study the amount of memory the fault detectors need for several specific tasks, and give bounds for each task. This amount is related to the size and number of the views that a processor has to maintain to ensure quick detection. This work may give implementation designers hints about the techniques and resources required to implement a task.
Additional information
An extended abstract of this paper was presented at the 12th International Symposium on DIStributed Computing (DISC’98). Shlomi Dolev is partly supported by the Israeli Ministry of Science and Arts grant #6756195. Part of this research was done while Shlomi Dolev was visiting the Laboratoire de Recherche en Informatique (LRI), University of Paris Sud.
Cite this article
Beauquier, J., Delaët, S., Dolev, S. et al. Transient fault detectors. Distrib. Comput. 20, 39–51 (2007). https://doi.org/10.1007/s00446-007-0029-x
DOI: https://doi.org/10.1007/s00446-007-0029-x