Abstract
Unreliable failure detectors, proposed by Chandra and Toueg [2], are mechanisms that provide information about process failures. In [2], eight classes of failure detectors were defined, depending on how accurate this information is, and an algorithm implementing a failure detector of one of these classes in a partially synchronous system was presented. This algorithm is based on all-to-all communication, and periodically exchanges a number of messages that is quadratic in the number of processes. To our knowledge, no other algorithm implementing these classes of unreliable failure detectors has been proposed.
In this paper, we present a family of distributed algorithms that implement four classes of unreliable failure detectors in partially synchronous systems. Our algorithms are based on a logical ring arrangement of the processes, which defines the monitoring and failure information propagation pattern. The resulting algorithms periodically exchange at most a linear number of messages.
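To make the quadratic versus linear contrast concrete, the following is a minimal sketch, not the algorithms of the paper. It assumes that each process on the ring polls only the first process after it that is not suspected, and it uses a single shared suspected set for simplicity, whereas a real failure detector keeps one list per process. The names `all_to_all_messages`, `ring_target`, and `ring_messages` are hypothetical, and replies and retransmissions are not counted.

```python
from typing import Optional, Set


def all_to_all_messages(n: int) -> int:
    """Messages per period when every process sends to every other process."""
    return n * (n - 1)


def ring_target(p: int, n: int, suspected: Set[int]) -> Optional[int]:
    """Return the first process after p on the ring that is not suspected.

    Assumption: processes are numbered 0..n-1 and the successor of p is
    (p + 1) mod n. Returns None if every other process is suspected.
    """
    for step in range(1, n):
        q = (p + step) % n
        if q not in suspected:
            return q
    return None


def ring_messages(n: int, suspected: Set[int]) -> int:
    """Messages per period if each non-suspected process polls one target."""
    count = 0
    for p in range(n):
        if p in suspected:
            continue  # a suspected (presumed crashed) process sends nothing
        if ring_target(p, n, suspected) is not None:
            count += 1
    return count


if __name__ == "__main__":
    n = 8
    print("all-to-all:", all_to_all_messages(n))
    print("ring, no suspicions:", ring_messages(n, set()))
    print("ring, 3 and 4 suspected:", ring_messages(n, {3, 4}))
```

Under these assumptions the ring pattern sends at most one message per process per period, so the count grows linearly with the number of processes, while the all-to-all pattern grows as n(n-1).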
References
[1] M. Aguilera and S. Toueg. Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication. Proceedings of the 11th International Workshop on Distributed Algorithms (WDAG), LNCS, Springer-Verlag, Germany, Sep. 1997.
[2] T. D. Chandra and S. Toueg. Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2), pages 225–267, Mar. 1996.
[3] T. D. Chandra, V. Hadzilacos, and S. Toueg. The Weakest Failure Detector for Solving Consensus. Journal of the ACM, 43(4), pages 685–722, Jul. 1996.
[4] D. Dolev, C. Dwork, and L. Stockmeyer. On the Minimal Synchronism Needed for Distributed Consensus. Journal of the ACM, 34(1), pages 77–97, Jan. 1987.
[5] C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the Presence of Partial Synchrony. Journal of the ACM, 35(2), pages 288–323, Apr. 1988.
[6] C. Fetzer and F. Cristian. Fail-Aware Failure Detectors. Proceedings of the 15th Symposium on Reliable Distributed Systems (SRDS), Canada, Oct. 1996.
[7] M. Fischer, N. Lynch, and M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2), pages 374–382, Apr. 1985.
[8] R. Guerraoui, M. Larrea, and A. Schiper. Non-Blocking Atomic Commitment with an Unreliable Failure Detector. Proceedings of the 14th Symposium on Reliable Distributed Systems (SRDS), Germany, Sep. 1996.
[9] R. Guerraoui and A. Schiper. Gamma-Accurate Failure Detectors. Proceedings of the 10th International Workshop on Distributed Algorithms (WDAG), LNCS, Springer-Verlag, Italy, Oct. 1996.
[10] M. Pease, R. Shostak, and L. Lamport. Reaching Agreement in the Presence of Faults. Journal of the ACM, 27(2), pages 228–234, Apr. 1980.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Larrea, M., Arevalo, S., Fernández, A. (1999). Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems. In: Jayanti, P. (eds) Distributed Computing. DISC 1999. Lecture Notes in Computer Science, vol 1693. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48169-9_3
DOI: https://doi.org/10.1007/3-540-48169-9_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66531-1
Online ISBN: 978-3-540-48169-0
eBook Packages: Springer Book Archive