Abstract
Good distributed computing practice calls for object implementations that tolerate contention, periods of asynchrony, and a large number of failures, yet perform fast when few failures occur, the system is synchronous, and there is no contention. This paper initiates the study of quorum systems that help design such implementations by combining optimal resilience with optimal best-case complexity. We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, which are in turn also third class quorums. First class quorums have large intersections with all other quorums; second class quorums typically have smaller intersections with those of the third class; and third class quorums simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, then degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, which allows algorithms relying on refined quorum systems to relax the assumption of independent process failures, often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, closing open gaps, as well as new complexity bounds we establish here. Each of our algorithms is representative of a different class of architectures, highlighting the generality of the refined quorum abstraction.
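The nesting of the three quorum classes can be illustrated with a small sketch. It is not the paper's formal definition: the 7-element set, the size thresholds (7, 6, and 5), and the helper names are all assumptions chosen for illustration, and the intersection requirements are reduced to simple cardinalities rather than stated against an adversary structure.

```python
from itertools import combinations


def subsets_of_size_at_least(universe, k):
    """Return all subsets of `universe` with at least k elements (as frozensets)."""
    items = sorted(universe)
    return {
        frozenset(c)
        for r in range(k, len(items) + 1)
        for c in combinations(items, r)
    }


def is_nested(qc1, qc2, qc3):
    """Check that class 1 is contained in class 2, and class 2 in class 3."""
    return qc1 <= qc2 <= qc3


def min_intersection(quorum_class, all_quorums):
    """Smallest intersection between a quorum of `quorum_class` and any quorum."""
    return min(len(a & b) for a in quorum_class for b in all_quorums)


# Hypothetical threshold-based example over 7 processes (assumed values).
S = set(range(7))
QC3 = subsets_of_size_at_least(S, 5)  # third class: traditional quorums
QC2 = subsets_of_size_at_least(S, 6)  # second class
QC1 = subsets_of_size_at_least(S, 7)  # first class

print(is_nested(QC1, QC2, QC3))
print(min_intersection(QC1, QC3), min_intersection(QC2, QC3), min_intersection(QC3, QC3))
```

In this toy setting, the guaranteed intersection with an arbitrary quorum shrinks from the first class to the third, which mirrors the intuition that accessing a first class quorum gives the strongest guarantee and therefore permits the fastest path.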
Additional information
This paper was originally invited to the special issue of Distributed Computing based on selected papers presented at the 26th ACM Symposium on Principles of Distributed Computing (PODC ’07).
Cite this article
Guerraoui, R., Vukolić, M. Refined quorum systems. Distrib. Comput. 23, 1–42 (2010). https://doi.org/10.1007/s00446-010-0103-7
DOI: https://doi.org/10.1007/s00446-010-0103-7