A continuum of failure models for distributed computing

  • Conference paper
Distributed Algorithms (WDAG 1992)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 647)

Abstract

A range of models of distributed computing is presented in which processors may fail either by crashing or by exhibiting arbitrary (Byzantine) behavior. In these models, the total number of faulty processors is bounded above by a constant t, subject to the proviso that no more than b ≤ t of these processors are Byzantine. At the two extremes of the range (i.e., b = 0 or b = t), we obtain models equivalent to the traditional models of pure crash failures and pure Byzantine failures, respectively. For 0 < b < t, the models that we introduce accommodate “real-world” experience, which shows that the overwhelming majority of failures are crashes but that occasionally some number of less-restrictive failures occur. We examine the Reliable Broadcast and Consensus problems within this new family of models and prove lower bounds on the relationship required among the number of processors, t, and b. We also present protocols that solve these problems and match the lower bounds. In presenting the protocols, we emphasize new algorithmic techniques that are fruitful in the new models but of limited value in either of the pure models.
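
As an illustration only, the fault budget described in the abstract can be sketched in Python. The class name `FailureModel` and the methods `kind` and `admits` are hypothetical names chosen for this sketch and do not come from the paper. The sketch encodes only the two constraints stated above (at most t faulty processors, at most b ≤ t of them Byzantine); it does not encode the paper's lower bounds or protocols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureModel:
    """Hypothetical container for the (t, b) fault budget from the abstract."""

    t: int  # upper bound on the total number of faulty processors
    b: int  # upper bound on how many of those may be Byzantine

    def __post_init__(self):
        # The abstract requires b <= t; negative bounds make no sense.
        if not 0 <= self.b <= self.t:
            raise ValueError("require 0 <= b <= t")

    def kind(self) -> str:
        """Name the point on the continuum that this (t, b) pair selects."""
        if self.b == 0:
            # Also covers the degenerate fault-free case t == 0.
            return "pure crash"
        if self.b == self.t:
            return "pure Byzantine"
        return "hybrid"

    def admits(self, crashed: int, byzantine: int) -> bool:
        """Return True if a failure pattern stays within the fault budget."""
        if crashed < 0 or byzantine < 0:
            raise ValueError("failure counts must be non-negative")
        return byzantine <= self.b and crashed + byzantine <= self.t
```

For example, `FailureModel(t=3, b=1)` is a hybrid model: it admits two crashes plus one Byzantine failure, but not one crash plus two Byzantine failures, because the second pattern exceeds b even though the total stays within t.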

Editor information

Adrian Segall, Shmuel Zaks

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Garay, J.A., Perry, K.J. (1992). A continuum of failure models for distributed computing. In: Segall, A., Zaks, S. (eds) Distributed Algorithms. WDAG 1992. Lecture Notes in Computer Science, vol 647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56188-9_11

  • DOI: https://doi.org/10.1007/3-540-56188-9_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56188-0

  • Online ISBN: 978-3-540-47484-5

  • eBook Packages: Springer Book Archive
