Abstract
Many replication protocols use a threshold model to express the failures they can tolerate. This model assumes that no more than t out of n components can fail, which is a good representation when failures are independent and identically distributed (IID). In many real systems, however, failures are not IID, and a straightforward application of threshold protocols yields suboptimal results. Here, we examine the problem of transforming threshold protocols into survivor-set protocols that tolerate dependent failures. Our main goal is to show the equivalence between the threshold model and the core/survivor-set model. Toward this goal, we develop two techniques to transform threshold protocols into survivor-set ones. Neither technique requires authentication, self-verification, or encryption. With the first technique, our results show that in one case we can transform a threshold protocol into a survivor-set one by spreading a number of processes across processors. This technique treats a given threshold algorithm as a black box, and consequently can transform any threshold algorithm. However, it has the disadvantage that the transformation is not possible for all sets of survivor sets. The second technique instead focuses on transforming voters: functions that evaluate to a single value out of a set of tallied values in a replication protocol. Voters are an essential part of many fault-tolerant protocols, and we show a universal way of transforming them. With such a transformation, we expect that a large number of protocols in the literature can be transformed directly. Whether the two models are equivalent is still an open problem, however, and our results constitute an important first step in this direction.
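The difference between the two failure models can be illustrated with a minimal sketch. It assumes the usual reading of a survivor set as a set of processes that may all remain correct in some execution, so a threshold model with n processes and at most t failures corresponds to all subsets of size n - t. The function names, the process names, and the dependent-failure system below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import combinations


def threshold_survivor_sets(processes, t):
    """Survivor sets induced by a threshold model: every subset of size n - t."""
    n = len(processes)
    return [frozenset(c) for c in combinations(sorted(processes), n - t)]


def tolerates(survivor_sets, faulty):
    """Return True if at least one survivor set contains no faulty process."""
    faulty = set(faulty)
    return any(s.isdisjoint(faulty) for s in survivor_sets)


# Hypothetical system of four processes.
processes = {"a", "b", "c", "d"}

# Threshold model with t = 1: all subsets of size 3.
threshold = threshold_survivor_sets(processes, t=1)

# Hypothetical dependent-failure system: a and b tend to fail together,
# as do c and d, so the survivor sets are {a, b} and {c, d}.
dependent = [frozenset({"a", "b"}), frozenset({"c", "d"})]

single_failure_ok = tolerates(threshold, {"a"})
correlated_under_threshold = tolerates(threshold, {"a", "b"})
correlated_under_dependent = tolerates(dependent, {"a", "b"})
uncorrelated_under_dependent = tolerates(dependent, {"a", "c"})
```

In this sketch the threshold system with t = 1 does not cover the correlated failure of a and b, whereas the survivor-set description does; conversely, the survivor-set description excludes the pattern in which a and c fail together.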
Additional information
Some elements of this paper appear in the paper entitled “Optimizing threshold protocols in adversarial structures” in the Proceedings of the 22nd International Symposium on Distributed Computing (DISC’08).
About this article
Cite this article
Junqueira, F.P., Marzullo, K., Herlihy, M. et al. Threshold protocols in survivor set systems. Distrib. Comput. 23, 135–149 (2010). https://doi.org/10.1007/s00446-010-0107-3
DOI: https://doi.org/10.1007/s00446-010-0107-3