Abstract
The problem of fault-tolerant coordination is fundamental in distributed computing. In the past, researchers have considered achieving simultaneous coordination under various failure assumptions. It has been shown that doing so optimally in synchronous systems with send/receive omission failures requires NP-hard local computation. This paper studies almost-optimal simultaneous coordination, which requires processors to coordinate within a constant additive or multiplicative number of rounds of the coordination time of an optimal protocol. It shows that achieving such coordination also requires NP-hard computation.
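One way to read the almost-optimality requirement, offered as a rough sketch and not as the paper's formal definition: write T_P(r) for the round in which a protocol P coordinates in run r, and T_opt(r) for the corresponding round of an optimal protocol. Both symbols, and the constant c, are introduced here for illustration only and do not appear in the abstract.

```latex
% Additive almost-optimality: at most c extra rounds in every run r
T_P(r) \;\le\; T_{\mathrm{opt}}(r) + c \qquad \text{for all runs } r,\ \text{with a fixed constant } c \ge 0

% Multiplicative almost-optimality: at most a factor c slower in every run r
T_P(r) \;\le\; c \cdot T_{\mathrm{opt}}(r) \qquad \text{for all runs } r,\ \text{with a fixed constant } c \ge 1
```

Under this reading, taking c = 0 in the additive form or c = 1 in the multiplicative form recovers exact optimality, which is the case previously shown to require NP-hard local computation.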
Additional information
Communicated by G. N. Frederickson.
This work was supported in part by the National Science Foundation under Grants CCR-9106627 and CCR-9301454. R. A. Bazzi was supported in part by a scholarship from the Hariri Foundation.
Cite this article
Bazzi, R.A., Neiger, G. The complexity of almost-optimal simultaneous coordination. Algorithmica 17, 308–321 (1997). https://doi.org/10.1007/BF02523194
DOI: https://doi.org/10.1007/BF02523194