Transient fault detectors

Distributed Computing

Abstract

We present fault detectors for transient faults (i.e., corruptions of the processors' memory, but not of their code). We distinguish fault detectors for tasks (i.e., the problem to be solved) from failure detectors for implementations (i.e., the algorithm that solves the problem). The aim of our fault detectors is to detect a memory corruption as soon as possible. We study the amount of memory that the fault detectors need for some specific tasks, and give bounds for each task. The amount of memory is related to the size and the number of views that a processor has to maintain to ensure quick detection. This work may give the implementation designer hints concerning the techniques and resources required for implementing a task.
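As a rough illustration of the idea of views, the following Python sketch shows a processor checking a distance-1 view against a task predicate. It is not the construction from the paper: the choice of task (proper vertex colouring), the graph representation, and the names `view`, `detects_fault`, and `faulty_processors` are all assumptions made for this example.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # adjacency lists (hypothetical representation)
State = Dict[int, int]        # processor id -> stored output (here, a colour)


def view(graph: Graph, state: State, p: int) -> Dict[int, int]:
    """Distance-1 view of p: its own value plus the values of its neighbours."""
    return {q: state[q] for q in [p, *graph[p]]}


def detects_fault(graph: Graph, state: State, p: int) -> bool:
    """True if p's view violates the assumed task (p shares a colour with a neighbour)."""
    v = view(graph, state, p)
    return any(v[q] == v[p] for q in graph[p])


def faulty_processors(graph: Graph, state: State) -> List[int]:
    """Processors whose local check fails in the given configuration."""
    return sorted(p for p in graph if detects_fault(graph, state, p))
```

In this sketch each processor keeps a single view of radius 1, so the memory used by the detector grows with the processor's degree. For the assumed colouring task that is enough, because any illegal configuration contains a monochromatic edge that both of its endpoints can see. A task that cannot be checked from such a small neighbourhood would need larger or more numerous views, which is the trade-off the abstract describes.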

Author information

Corresponding author

Correspondence to Joffroy Beauquier.

Additional information

An extended abstract of this paper was presented at the 12th International Symposium on DIStributed Computing (DISC’98). Shlomi Dolev is partly supported by the Israeli Ministry of Science and Arts grant #6756195. Part of this research was done while Shlomi Dolev was visiting the Laboratoire de Recherche en Informatique (LRI), University of Paris Sud.

About this article

Cite this article

Beauquier, J., Delaët, S., Dolev, S. et al. Transient fault detectors. Distrib. Comput. 20, 39–51 (2007). https://doi.org/10.1007/s00446-007-0029-x

  • DOI: https://doi.org/10.1007/s00446-007-0029-x
