Refined quorum systems

Distributed Computing

Abstract

It is considered good distributed computing practice to devise object implementations that tolerate contention, periods of asynchrony and a large number of failures, but perform fast if few failures occur, the system is synchronous and there is no contention. This paper initiates the study of quorum systems that help design such implementations by combining optimal resilience with optimal best-case complexity. We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, which are in turn also third class quorums. First class quorums have large intersections with all other quorums; second class quorums typically have smaller intersections with those of the third class; and third class quorums simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, then degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, which allows algorithms relying on refined quorum systems to relax the assumption of independent process failures, an assumption often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, closing open gaps, and both also match new complexity bounds we establish here. Each of our algorithms is representative of a different class of architectures, highlighting the generality of the refined quorum abstraction.
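
The nesting of the three quorum classes described in the abstract can be illustrated with a small Python sketch. This is not the paper's formal definition. The threshold sizes (5, 4 and 3 out of 5 servers) and all function names are assumptions chosen for this example, and the sketch checks only class nesting and plain pairwise intersection, not the intersection properties required under a general adversary structure.

```python
from itertools import combinations


def threshold_quorums(servers, size):
    """All subsets of `servers` with exactly `size` elements (hypothetical helper)."""
    return {frozenset(c) for c in combinations(sorted(servers), size)}


def is_nested(qc1, qc2, qc3):
    """First class quorums are second class quorums, which are third class quorums."""
    return qc1 <= qc2 <= qc3


def pairwise_intersecting(quorums):
    """Traditional quorum property: any two quorums share at least one element."""
    return all(a & b for a in quorums for b in quorums)


def min_intersection(cls, all_quorums):
    """Smallest overlap between a quorum of `cls` and any quorum in the system."""
    return min(len(a & b) for a in cls for b in all_quorums)


def best_class(responders, qc1, qc2, qc3):
    """Best (lowest-numbered) class containing a quorum fully inside `responders`."""
    responders = frozenset(responders)
    for rank, cls in ((1, qc1), (2, qc2), (3, qc3)):
        if any(q <= responders for q in cls):
            return rank
    return None  # no quorum was accessed


# Assumed toy system: 5 servers, class sizes 5 / at least 4 / at least 3.
servers = set(range(5))
qc1 = threshold_quorums(servers, 5)
qc2 = qc1 | threshold_quorums(servers, 4)
qc3 = qc2 | threshold_quorums(servers, 3)

if __name__ == "__main__":
    print("nested:", is_nested(qc1, qc2, qc3))
    print("third class pairwise intersecting:", pairwise_intersecting(qc3))
    for name, cls in (("first", qc1), ("second", qc2), ("third", qc3)):
        print(name, "class min overlap:", min_intersection(cls, qc3))
    print("class reached with 4 responders:", best_class({0, 1, 2, 3}, qc1, qc2, qc3))
```

In this toy system the minimum overlap with any other quorum shrinks from the first class to the third, which mirrors the intuition that an operation can be expedited when a first class quorum responds and degrades gracefully otherwise.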

Author information

Correspondence to Marko Vukolić.

Additional information

This paper was originally invited to the special issue of Distributed Computing based on selected papers presented at the 26th ACM Symposium on Principles of Distributed Computing (PODC ’07).

About this article

Cite this article

Guerraoui, R., Vukolić, M. Refined quorum systems. Distrib. Comput. 23, 1–42 (2010). https://doi.org/10.1007/s00446-010-0103-7

  • DOI: https://doi.org/10.1007/s00446-010-0103-7
