Skip to main content
Log in

Fault-containing self-stabilizing distributed protocols

  • Published:
Distributed Computing Aims and scope Submit manuscript

Abstract

Self-stabilization is an elegant approach for designing a class of fault-tolerant distributed protocols. A self-stabilizing protocol is guaranteed to eventually converge to a legitimate state after a transient fault. However, even a minor transient fault can cause vast disruption in the system before legitimacy is reached. This paper introduces the notion of fault-containment to address this particular weakness of self-stabilizing systems. Informally, a fault-containing self-stabilizing protocol, in addition to providing self- stabilization, contains the effects of faults. This ensures that disruption during recovery from faults, is proportional to the extent of the faults. The paper begins with a formal framework for specifying and evaluating fault-containing self-stabilizing protocols. The main result of the paper is a transformer that converts any non-reactive self-stabilizing protocol into an equivalent fault-containing self-stabilizing protocol that can repair any single fault in the system in O(1) time. For a large class of input protocols, the corresponding output protocols produced by the transformer have O(1) space overhead. The small time and space overhead make the fault-containing self-stabilizing protocol a practical alternative to the original self-stabilizing protocol. The transformer is based on a novel stabilizing timer paradigm that significantly simplifies the task of fault-containment.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Afek, Y., Dolev, S.: Local stabilizer. In: Proceedings of the 5th Israeli Conference on Theory of Computing and Systems, pp. 74–84 (1997)

  2. Anagnostou, E., El-Yaniv, R., Hadzilacos, V.: Memory adaptive self-stabilizing protocols. In: WDAG92 Distributed Algorithms 6th International Workshop Proceedings. LNCS vol. 647, pp. 203–220. Springer, Heidelberg (1992)

  3. Arora A. and Gouda M.G. (1994). Distributed reset. IEEE Trans. Comput. 43: 1026–1038

    Article  MATH  Google Scholar 

  4. Arora, A., Kulkarni, S.S.: Designing masking fault-tolerance via nonmasking fault-tolerance. In: SRDS95 Proceedings of the 14th Symposium on Reliable Distributed Systems, pp. 174–185 (1995)

  5. Arora A. and Kulkarni S.S. (1997). Component based design of multitolerance. IEEE Trans. Softw. Eng. 43: 1026–1038

    Google Scholar 

  6. Awerbuch B. (1985). Complexity of network synchronization. J. ACM 32: 804–823

    Article  MATH  Google Scholar 

  7. Awerbuch, B., Kutten, S., Mansour, Y., Patt-Shamir, B., Varghese, G.: Time optimal self-stabilizing synchronization. In: Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pp. 652–661 (1993)

  8. Awerbuch, B., Varghese, G.: Distributed program checking: a paradigm for building self-stabilizing distributed protocols. In: Proceedings of the 31st Annual IEEE Symposium on Foundations of Computer Science, pp. 258–267 (1991)

  9. Beauquier, J., Genolini, C., Kutten, S.: Optimal reactive k-stabilization – the case of mutual exclusion. In: Proceedings of the 18th Annual ACM Symposium on Principles of Distributed Computing, pp. 209–218 (1999)

  10. Bruell S.C., Ghosh S., Karaata M.H. and Pemmaraju S.V. (1999). Self-stabilizing algorithms for finding centers and medians of trees. SIAM J. Comput. 29(2): 600–614

    Article  MATH  Google Scholar 

  11. Buskens, R.W., Bianchini, R.P.: Self-stabilizing mutual exclusion in the presence of faulty nodes. In: FTCS95 Proceedings of the 25th International Symposium on Fault-Tolerant Computing Systems, pp. 144–153 (1995)

  12. Chen N.S., Yu H.P. and Huang S.T. (1991). A self-stabilizing algorithm for constructing spanning trees. Informat. Process. Lett. 39: 147–151

    Article  MATH  Google Scholar 

  13. Couvreur, M., Francez, N., Gouda, M.G.: Asynchronous unison. In: Proceedings of the 12th International Conference on Distributed Computing Systems, pp. 486–493 (1992)

  14. Dijkstra E.W. (1974). Self stabilizing systems in spite of distributed control. Communi. ACM 17: 643–644

    Article  MATH  Google Scholar 

  15. Dolev, S.: Optimal time self-stabilization in dynamic systems. In: WDAG93 Distributed Algorithms 7th International Workshop Proceedings. LNCS, vol. 725, pp. 160–173 Springer, Heidelberg (1993)

  16. Dolev S. (2000). Self-Stabilization. MIT Press, Cambridge

    MATH  Google Scholar 

  17. Dolev, S., Herman, T.: Superstabilizing protocols for dynamic distributed systems. In: Proceedings of the Second Workshop on Self-Stabilizing Systems, pp. 3.1–3.15 (1995)

  18. Dolev, S., Israeli, A., Moran, S.: Uniform dynamic self-stabilizing leader election. In: WDAG91 Distributed Algorithms 5th International Workshop Proceedings. LNCS, vol. 579, pp. 167–180 Springer, Heidelberg (1991)

  19. Ghosh S. and Gupta A. (1996). An exercise in fault-containment: Self-stabilizing leader election. Informat. Process. Lett. 5(59): 281–288

    Article  Google Scholar 

  20. Ghosh, S., Gupta, A., Herman, T., Pemmaraju, S.V.: Fault-containing self-stabilizing distributed protocols. Technical Report TR-00-01, Department of Computer Science, The University of Iowa, Iowa City (2000)

  21. Ghosh S., Gupta A. and Pemmaraju S.V. (1996). A fault-containing self-stabilizing algorithm for spanning trees. J. Comput. Informat. 2: 322–338

    Google Scholar 

  22. Ghosh, S., Gupta, A., Pemmaraju, S.V.: Fault-containing network protocols. In: Proceedings of 12th Annual ACM Symposium on Applied Computing (1997)

  23. Ghosh S., Gupta A. and Pemmaraju S.V. (1997). A self-stabilizing algorithm for the maximum flow problem. Distributed Comput. 10: 167–180

    Article  Google Scholar 

  24. Ghosh, S., He, X.: Scalable self-stabilization. In: Proceedings of the 4th Workshop on Self-Stabilization, pp. 18–24 (1999)

  25. Gopal, A.S., Perry, K.J.: Unifying self-stabilization and fault-tolerance. In: Proceedings of the 12th Annual ACM Symposium on Principles of Distributed Computing, pp. 195–206 (1993)

  26. Gouda, M.G., Schneider, M.: Maximum flow routing. In: Proceedings of the Second Workshop on Self-Stabilizing Systems, pp. 2.1–2.13 (1995)

  27. Herman, T.: Superstabilizing mutual exclusion. In: Proceedings of 1st International Conference on Parallel and Distributed Processing: Techniques and Applications (1995)

  28. Kutten, S., Patt-Shamir, B.: Asynchronous time-adaptive self-stabilization. In: Brief Announcement, Proceedings of the 17th Annual ACM Symposium on Principles of Distributed Computing (1998)

  29. Kutten S. and Patt-Shamir B. (1999). Stabilizing time-adaptive protocols. Theor. Comput. Sci. 220: 93–111

    Article  MATH  Google Scholar 

  30. Kutten, S., Peleg, D.: Fault-local distributed mending. In: Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing, pp. 20–27 (1995)

  31. Manna Z. and Pnueli A. (1992). The Temporal Logic of Reactive and Concurrent Systems. Springer, Heidelberg

    Google Scholar 

  32. Nelson, V.: Fault-tolerant computing: fundamental concepts. IEEE Comput., pp. 19–25 (1990)

  33. Parlati, G., Yung, M.: Non-exploratory self-stabilization for constant-space symmetry-breaking. In: Algorithms ESA 94. LNCS, vol. 855, pp. 183–201 Springer, Heidelberg (1994)

  34. Schneider M. (1993). Self-stabilization. ACM Comput. Surveys 25: 45–67

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sukumar Ghosh.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ghosh, S., Gupta, A., Herman, T. et al. Fault-containing self-stabilizing distributed protocols. Distrib. Comput. 20, 53–73 (2007). https://doi.org/10.1007/s00446-007-0032-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00446-007-0032-2

Keywords

Navigation