Summary
There is a very close relationship between common knowledge and simultaneity in synchronous distributed systems. The analysis of several well-known problems in terms of common knowledge has led to round-optimal protocols for these problems, including Reliable Broadcast, Distributed Consensus, and the Distributed Firing Squad problem. These problems require that the correct processors coordinate their actions in some way but place no restrictions on the behaviour of the faulty processors. In systems with benign processor failures, however, it is reasonable to require that the actions of a faulty processor be consistent with those of the correct processors, assuming it performs any action at all. We consider problems requiring consistent, simultaneous coordination. We then analyze these problems in terms of common knowledge in several failure models. The analysis of these stronger problems requires a stronger definition of common knowledge, and we study the relationship between these two definitions. In many cases, the two definitions are actually equivalent, and simple modifications of previous solutions yield round-optimal solutions to these problems. When the definitions differ, however, we show that such problems cannot be solved, even in failure-free executions.
Additional information
Gil Neiger was born on February 19, 1957 in New York, New York. In June 1979, he received an A.B. in Mathematics and Psycholinguistics from Brown University in Providence, Rhode Island. In February 1985, he spent two weeks picking cotton in Nicaragua in a brigade of international volunteers. In January 1986, he received an M.S. in Computer Science from Cornell University in Ithaca, New York. On August 20, 1988, Gil Neiger married Hilary Lombard in Lansing, New York. In August 1988, he received a Ph.D. in Computer Science, also from Cornell University. Since August 1988, he has been an Assistant Professor in the College of Computing (formerly School of Information and Computer Science) at the Georgia Institute of Technology in Atlanta, Georgia.
Mark Tuttle was born in Lincoln, Nebraska in 1962. He received his B.S. in math and computer science from the University of Nebraska-Lincoln in 1984, and his M.S. and Ph.D. from the Massachusetts Institute of Technology in 1987 and 1989. He is currently a member of the research staff at Digital Equipment Corporation's Cambridge Research Lab in Cambridge, Massachusetts. His research interests include models for distributed computation, knowledge and distributed computation, computer security, and concurrent data structures.
An earlier version of this paper appeared in J. van Leeuwen and N. Santoro (eds.), Proceedings of the Fourth International Workshop on Distributed Algorithms, volume 486 of Lecture Notes in Computer Science, pages 334–352. Springer, September 1990.
This author was supported in part by the National Science Foundation under grants CCR-8909663 and CCR-9106627.
Neiger, G., Tuttle, M.R. Common knowledge and consistent simultaneous coordination. Distrib Comput 6, 181–192 (1993). https://doi.org/10.1007/BF02242706