Threshold protocols in survivor set systems

Distributed Computing

Abstract

Many replication protocols use a threshold model to express the failures they can tolerate. This model assumes that no more than t out of n components can fail, which is a good representation when failures are independent and identically distributed (IID). In many real systems, however, failures are not IID, and a straightforward application of threshold protocols yields suboptimal results. Here, we examine the problem of transforming threshold protocols into survivor set protocols that tolerate dependent failures. Our main goal is to show the equivalence between the threshold model and the core/survivor set model. Toward this goal, we develop two techniques for transforming threshold protocols into survivor set ones. Neither technique requires authentication, self-verification, or encryption. The first technique transforms a threshold protocol into a survivor set one by spreading a number of processes across processors. It treats a given threshold algorithm as a black box and can consequently transform any threshold algorithm, but it has the disadvantage that the transformation is not possible for all sets of survivor sets. The second technique instead focuses on transforming voters: functions that evaluate to a value out of a set of tallied values in a replication protocol. Voters are an essential part of many fault-tolerant protocols, and we show a universal way of transforming them. We therefore expect that a large number of protocols in the literature can be transformed directly with this technique. Whether the two models are equivalent remains an open problem, however, and our results constitute an important first step in this direction.
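
The two failure models contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the paper's transformation. It assumes the usual reading in which a survivor set system tolerates a failure pattern when at least one survivor set contains no faulty process, and in which the threshold model with parameters n and t corresponds to taking every subset of size n - t as a survivor set. The function names (threshold_survivor_sets, tolerates, threshold_voter) and the example process names are hypothetical, and the voter shown is a plain threshold voter rather than a transformed one.

```python
from collections import Counter
from itertools import combinations


def threshold_survivor_sets(processes, t):
    """Survivor sets induced by a threshold model (assumption: every
    subset of size n - t is a survivor set)."""
    n = len(processes)
    return {frozenset(c) for c in combinations(sorted(processes), n - t)}


def tolerates(survivor_sets, faulty):
    """True if at least one survivor set contains no faulty process."""
    faulty = set(faulty)
    return any(s.isdisjoint(faulty) for s in survivor_sets)


def threshold_voter(tally, t):
    """Hypothetical threshold voter: return a value reported by at least
    t + 1 processes, or None if no value reaches that count."""
    counts = Counter(tally.values())
    for value, count in counts.most_common():
        if count >= t + 1:
            return value
    return None


processes = {"a", "b", "c", "d"}

# Threshold model with n = 4, t = 1: any single failure is tolerated.
iid = threshold_survivor_sets(processes, t=1)
print(tolerates(iid, {"a"}))        # True
print(tolerates(iid, {"a", "b"}))   # False

# Dependent failures: either {a, b} or {c, d} is assumed to survive.
dependent = {frozenset({"a", "b"}), frozenset({"c", "d"})}
print(tolerates(dependent, {"a", "b"}))  # True
print(tolerates(dependent, {"a", "c"}))  # False

# With t = 1, a value needs at least two matching reports.
print(threshold_voter({"a": 1, "b": 1, "c": 0, "d": 1}, t=1))  # 1
```

In the dependent example, two simultaneous failures are tolerated only when they fall inside the same pair, a distinction that a single threshold t cannot express for these four processes.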

Author information

Corresponding author

Correspondence to Flavio P. Junqueira.

Additional information

Some elements of this paper appear in the paper entitled “Optimizing threshold protocols in adversarial structures” in the Proceedings of the 22nd International Symposium on Distributed Computing (DISC’08).

About this article

Cite this article

Junqueira, F.P., Marzullo, K., Herlihy, M. et al. Threshold protocols in survivor set systems. Distrib. Comput. 23, 135–149 (2010). https://doi.org/10.1007/s00446-010-0107-3

  • DOI: https://doi.org/10.1007/s00446-010-0107-3
