Regular Article
Tolerating Transient and Intermittent Failures

https://doi.org/10.1006/jpdc.2001.1827

Abstract

Fault tolerance is a crucial property of modern distributed systems. We propose an algorithm that solves the census problem (each processor lists all processor identifiers together with their relative distances) on an arbitrary strongly connected network.
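To make the expected output of a census concrete, the following minimal sketch computes it centrally by breadth-first search. It is not the distributed algorithm of the paper, only a reference specification. It assumes that "relative distance" means the length of the shortest directed path from a processor q to the processor p holding the list; the names `census`, `successors`, and `ring` are hypothetical.

```python
from collections import deque


def census(successors):
    """Return {p: {q: d}}, where d is the length of the shortest
    directed path from q to p (an assumed reading of "relative distance").

    `successors` maps each processor identifier to the identifiers
    reachable through one outgoing link.
    """
    result = {p: {} for p in successors}
    for source in successors:
        # Plain breadth-first search from `source` along outgoing links.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in successors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Every processor reached records how far away `source` is.
        for p, d in dist.items():
            result[p][source] = d
    return result


# A unidirectional ring 1 -> 2 -> 3 -> 1 (strongly connected).
ring = {1: [2], 2: [3], 3: [1]}
ring_census = census(ring)
```

On this ring, processor 1 receives from processor 3 directly and from processor 2 through processor 3, so `ring_census[1]` maps 3 to distance 1 and 2 to distance 2.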

The algorithm tolerates transient faults that corrupt the memory of processors and communication links (it is self-stabilizing), as well as intermittent faults on the communication media (fair loss, reordering, and finite duplication of messages). A formal proof establishes its correctness for the census problem. The algorithm also leads to the construction of self-stabilizing algorithms for any silent problem that tolerate the same communication hazards.
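The combination of hazards described above can be illustrated with a small simulation. The sketch below is not the algorithm of the paper. It is a simplified table-exchange scheme written under several assumptions that the abstract does not state: every processor knows an upper bound `bound` on the network size, distinguishes its incoming links, and sends its whole table on every outgoing link in each round. Message loss, delay, and duplication are drawn at random with arbitrary probabilities, and corrupted memory is assumed to still hold well-typed values. All names (`run_census`, `cache`, `in_flight`) are hypothetical.

```python
import random


def run_census(successors, cache, in_flight, bound, rounds=300, seed=0):
    """Simulate rounds of table exchange over unreliable links.

    cache[p][q] is the last table p received from q; in_flight is a list
    of (receiver, sender, table) messages. Both may start corrupted.
    """
    rng = random.Random(seed)
    in_flight = list(in_flight)
    preds = {p: [q for q in successors if p in successors[q]]
             for p in successors}
    tables = {}
    for _ in range(rounds):
        for p in successors:
            # Rebuild the table from scratch: only p's own identifier is
            # trusted, everything else comes from the per-link caches.
            table = {p: 0}
            for q in preds[p]:
                for ident, d in cache[p].get(q, {}).items():
                    # Entries at or beyond the assumed bound are dropped,
                    # so bogus identifiers cannot circulate forever.
                    if ident != p and 0 <= d and d + 1 < bound:
                        table[ident] = min(table.get(ident, bound), d + 1)
            tables[p] = table
            for r in successors[p]:
                copies = 2 if rng.random() < 0.2 else 1  # duplication
                in_flight.extend((r, p, dict(table)) for _ in range(copies))
        rng.shuffle(in_flight)  # reordering
        delayed = []
        for receiver, sender, table in in_flight:
            u = rng.random()
            if u < 0.3:
                continue  # message lost
            if u < 0.5:
                delayed.append((receiver, sender, table))  # stays in transit
            else:
                cache[receiver][sender] = table  # delivered
        in_flight = delayed
    return tables


# Unidirectional ring 1 -> 2 -> 3 -> 1 with corrupted initial memory:
# a nonexistent identifier 99, wrong distances, and a garbage message.
ring = {1: [2], 2: [3], 3: [1]}
corrupted_cache = {1: {3: {99: 0, 2: 0}}, 2: {1: {}}, 3: {2: {1: 1}}}
garbage = [(2, 1, {42: 0, 3: 0})]
final = run_census(ring, corrupted_cache, garbage, bound=3)
```

In this toy setting the bogus identifiers age out once their distance reaches the bound, and the tables are expected to settle on the census of the ring (each processor at distance 0 from itself and the other two at distances 1 and 2). Repeating the full table in every message is what lets a lost, duplicated, or stale message be overwritten by a later one.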

References (29)

  • S. Dolev, Self-stabilizing routing and related protocols, J. Parallel Distrib. Comput. (1997)
  • N.V. Stenning, A data transfer protocol, Computer Networks (1976)
  • Y. Afek et al., Self-stabilizing unidirectional network algorithms by power supply, Chicago J. Theoret. Comput. Sci. (1998)
  • Y. Afek et al., Self-stabilization over unreliable communication media, Distrib. Comput. (1993)
  • Y. Afek et al., Memory-efficient self-stabilization on general networks (1990)
  • E. Anagnostou et al., Tolerating transient and permanent failures (1993)
  • B. Awerbuch et al., Self-stabilization by local checking and correction
  • K.A. Bartlett et al., A note on reliable full-duplex transmission over half-duplex links, Comm. Assoc. Comput. Mach. (1969)
  • A. Basu et al., Simulating reliable links in the presence of process crashes (1996)
  • J. Beauquier et al., Self-stabilizing census with cut-through constraint (1999)
  • J. Beauquier et al., Fault tolerance and self-stabilization: Impossibility results and solutions using self-stabilizing failure detectors, Int. J. Systems Sci. (1997)
  • A. Bui et al., State-optimal snap-stabilizing PIF in tree networks, Proceedings of the 4th Workshop on Self-stabilizing Systems (1999)
  • S.K. Das et al., Self-stabilizing algorithms in DAG structured networks, Parallel Process. Lett. (1999)
  • S. Delaët et al., Un algorithme auto-stabilisant en dépit de communications non fiables, Tech. Sci. Inform. (1988)

    An extended abstract of a preliminary version of this paper appeared in [3]. This work was supported in part by the French STAR project.


