Abstract
A range of models of distributed computing is presented in which processors may fail either by crashing or by exhibiting arbitrary (Byzantine) behavior. In these models, the total number of faulty processors is bounded from above by a constant t, subject to the proviso that no more than b ≤ t of these processors are Byzantine. At the two extremes of the range (i.e., b = 0 or b = t) we get models that are equivalent to the traditional models of either pure crash failures or pure Byzantine failures. For 0 < b < t, the models that we introduce accommodate “real-world” experience that shows that the overwhelming majority of failures are crashes but occasionally some number of less-restrictive failures occur. We examine the Reliable Broadcast and Consensus problems within this new family of models and prove lower bounds on the relationship required between the number of processors, t, and b. We also present protocols to solve these problems, which match the lower bounds. In presenting the protocols, we emphasize new algorithmic techniques that are fruitful to use in the new models but which have limited value in either of the pure models.
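The parameterization described in the abstract can be illustrated with a minimal sketch. The class and method names below (`HybridFailureModel`, `regime`, `admits`) are hypothetical and do not come from the paper. The sketch encodes only the constraints stated in the abstract (at most t faulty processors, of which at most b ≤ t are Byzantine). It does not encode the paper's lower bounds on the number of processors, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridFailureModel:
    """One point in the continuum (hypothetical helper, not from the paper).

    t: upper bound on the total number of faulty processors
    b: upper bound on how many of those faulty processors may be Byzantine
    """
    t: int
    b: int

    def __post_init__(self) -> None:
        # The abstract requires b <= t; counts cannot be negative.
        if not 0 <= self.b <= self.t:
            raise ValueError("require 0 <= b <= t")

    def regime(self) -> str:
        """Name the part of the range this (t, b) pair falls in."""
        if self.b == 0:
            return "pure crash"       # extreme b = 0 (also covers t = 0)
        if self.b == self.t:
            return "pure Byzantine"   # extreme b = t
        return "hybrid"               # 0 < b < t

    def admits(self, crashed: int, byzantine: int) -> bool:
        """Check whether a run with these fault counts stays within the model."""
        if crashed < 0 or byzantine < 0:
            raise ValueError("fault counts must be non-negative")
        return byzantine <= self.b and crashed + byzantine <= self.t


if __name__ == "__main__":
    model = HybridFailureModel(t=3, b=1)
    print(model.regime())
    print(model.admits(crashed=2, byzantine=1))
    print(model.admits(crashed=1, byzantine=2))
```

In this sketch a run with two crashes and one Byzantine processor fits the model with t = 3 and b = 1, while a run with two Byzantine processors does not, even though the total number of faults stays within t.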
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Garay, J.A., Perry, K.J. (1992). A continuum of failure models for distributed computing. In: Segall, A., Zaks, S. (eds) Distributed Algorithms. WDAG 1992. Lecture Notes in Computer Science, vol 647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56188-9_11
DOI: https://doi.org/10.1007/3-540-56188-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56188-0
Online ISBN: 978-3-540-47484-5
eBook Packages: Springer Book Archive