ABSTRACT
We consider the problem of gossiping when dynamic node crashes are controlled by adaptive adversaries. We develop gossiping algorithms that are efficient with respect to both time and communication, where communication is measured as the number of point-to-point messages. If the adversary is allowed to fail up to $t$ nodes among the total of $n$, where additionally $n-t=\Omega(n/\text{polylog } n)$, then one of our algorithms completes gossiping in time $\mathcal{O}(\log^2 t)$ and with $\mathcal{O}(n\,\text{polylog } t)$ messages. We prove a lower bound which states that the time has to be at least $\Omega\Big(\frac{\log n}{\log(n\log n)-\log t}\Big)$ if the communication is restricted to be $\mathcal{O}(n\,\text{polylog } n)$. We also show that one can efficiently solve the more demanding consensus problem with crash failures by resorting to one of our gossiping algorithms. If the adversary is allowed to fail $t$ nodes, where $n-t=\Omega(n/\text{polylog } n)$, we obtain a time-optimal solution that is within a polylogarithmic factor of communication optimality.
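To get a feel for the lower bound, it may help to evaluate it at two sample values of $t$. The choices $t=n/2$ and $t=\sqrt{n}$ below are illustrative assumptions, not cases singled out in the abstract, and logarithms are taken base 2 for concreteness.

```latex
% The denominator of the bound can be merged into a single logarithm:
\[
  \log(n\log n)-\log t \;=\; \log\frac{n\log n}{t}.
\]
% Case t = n/2 (a constant fraction of the nodes may crash):
\[
  \log\frac{n\log n}{n/2} \;=\; \log(2\log n) \;=\; 1+\log\log n,
  \qquad\text{so the bound becomes}\qquad
  \Omega\!\left(\frac{\log n}{\log\log n}\right).
\]
% Case t = \sqrt{n} (polynomially fewer crashes than nodes):
\[
  \log\frac{n\log n}{\sqrt{n}} \;=\; \tfrac12\log n+\log\log n,
  \qquad\text{so}\qquad
  \frac{\log n}{\tfrac12\log n+\log\log n} \;<\; 2,
\]
% i.e. the bound degenerates to \Omega(1) in this regime.
```

Under these assumptions the lower bound is only informative when $t$ is close to $n$. For $t=n/2$ it leaves a gap between $\Omega(\log n/\log\log n)$ and the $\mathcal{O}(\log^2 t)$ running time stated above.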