Scalable Byzantine fault-tolerant state-machine replication on heterogeneous servers

Eischer, Michael; Distler, Tobias

doi:10.1007/s00607-018-0652-3

Scalable Byzantine fault-tolerant state-machine replication on heterogeneous servers

Published: 21 August 2018

Volume 101, pages 97–118, (2019)
Cite this article

Computing Aims and scope Submit manuscript

395 Accesses
14 Citations
Explore all metrics

Abstract

When provided with more powerful or extra hardware, state-of-the-art Byzantine fault-tolerant (BFT) replication protocols are unable to effectively exploit the additional computing resources: on the one hand, in settings with heterogeneous servers existing protocols cannot fully utilize servers with higher performance capabilities. On the other hand, using more servers than the minimum number of replicas required for Byzantine fault tolerance in general does not lead to improved throughput and latency, but instead actually degrades performance. In this paper, we address these problems with Omada, a BFT system architecture that is able to benefit from additional hardware resources. To achieve this property while still providing strong consistency, Omada first parallelizes agreement into multiple groups and then executes the requests handled by different groups in a deterministic order. By varying the number of requests to be ordered between groups as well as the number of groups that a replica participates in between servers, Omada offers the possibility to individually adjust the resource usage per server. Moreover, the fact that not all replicas need to take part in every group enables the architecture to exploit additional servers.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Towards Efficient and Robust BFT Protocols with ER-BFT (Short Paper)

BChain: Byzantine Replication with High Throughput and Embedded Reconfiguration

Supr: Adaptive Byzantine Fault-Tolerant Replication

References

Aguilera MK, Strom RE (2000) Efficient atomic broadcast using deterministic merge. In: Proceedings of the 19th symposium on principles of distributed computing, pp 209–218
Amir Y, Coan B, Kirsch J, Lane J (2007) Customizable fault tolerance for wide-area replication. In: Proceedings of the 26th international symposium on reliable distributed systems, pp 65–82
Amir Y, Coan B, Kirsch J, Lane J (2011) Prime: Byzantine replication under attack. In: IEEE transactions on dependable and secure computing, vol 8, no 4, pp 564–577
Aublin PL, Mokhtar SB, Quéma V (2013) RBFT: Redundant Byzantine fault tolerance. In: Proceedings of the 33rd international conference on distributed computing systems, pp 297–306
Babay A, Amir Y (2016) Fast total ordering for modern data centers. In: Proceedings of the 36th international conference on distributed computing systems, pp 669–679
Behl J, Distler T, Kapitza R (2015) Consensus-oriented parallelization: how to earn your first million. In: Proceedings of the 16th middleware conference, pp 173–184
Bessani A, Sousa J, Alchieri EEP (2014) State machine replication for the masses with BFT-SMaRt. In: Proceedings of the 44th international conference on dependable systems networks, pp 355–362
Castro M, Liskov B (1999) Practical Byzantine fault tolerance. In: Proceedings of the 3rd symposium on operating systems design and implementation, pp 173–186
Castro M, Rodrigues R, Liskov B (2003) BASE: using abstraction to improve fault tolerance. In: ACM transactions on computer systems, vol 21, no 3, pp 236–269
Clement A, Kapritsos M, Lee S, Wang Y, Alvisi L, Dahlin M, Riche T (2009) UpRight cluster services. In: Proceedings of the 22nd symposium on operating systems principles, pp 277–290
Distler T, Kapitza R, Reiser HP (2010) State transfer for hypervisor-based proactive recovery of heterogeneous replicated services. In: Proceedings of the 5th “Sicherheit, Schutz und Zuverlässigkeit” conference, pp 61–72
Distler T, Cachin C, Kapitza R (2016) Resource-efficient Byzantine fault tolerance. In: IEEE transactions on computers, vol 65, no 9, pp 2807–2819
Garcia M, Bessani A, Gashi I, Neves N, Obelheiro R (2014) Analysis of operating system diversity for intrusion tolerance. In: Software—practice & experience, vol 44, no 6, pp 735–770
Hunt P, Konar M, Junqueira F, Reed B (2010) ZooKeeper: wait-free coordination for Internet-scale systems. In: Proceedings of the 2010 USENIX annual technical conference, pp 145–158
Junqueira F, Bhagwan R, Hevia A, Marzullo K, Voelker GM (2005) Surviving Internet catastrophes. In: Proceedings of the 2005 USENIX annual technical conference, pp 45–60
Kapitza R, Behl J, Cachin C, Distler T, Kuhnle S, Mohammadi SV, Schröder-Preikschat W, Stengel K (2012) CheapBFT: resource-efficient Byzantine fault tolerance. In: Proceedings of the 7th European conference on computer systems, pp 295–308
Kapritsos M, Junqueira FP (2010) Scalable agreement: toward ordering as a service. In: Proceedings of the 6th workshop on hot topics in system dependability, pp 7–12
Li B, Xu W, Abid MZ, Distler T, Kapitza R (2016) SAREK: optimistic parallel ordering in Byzantine fault tolerance. In: Proceedings of the 12th European dependable computing conference, pp 77–88
Mao Y, Junqueira FP, Marzullo K (2008) Mencius: building efficient replicated state machines for WANs. In: Proceedings of the 8th conference on operating systems design and implementation, pp 369–384
Ou Z, Zhuang H, Lukyanenko A, Nurminen JK, Hui P, Mazalov V, Ylä-Jääski A (2013) Is the same instance type created equal? Exploiting heterogeneity of public clouds. In: IEEE transactions on cloud computing, vol 1, no 2, pp 201–214
Papadimitriou CH, Steiglitz K (1998) Combinatorial optimization: algorithms and complexity. Dover Publications, New York
MATH Google Scholar
Pease M, Shostak R, Lamport L (1980) Reaching agreement in the presence of faults. J ACM 27(2):228–234
Article MathSciNet MATH Google Scholar
Veronese GS, Correia M, Bessani AN, Lung LC (2009) Spin one’s wheels? Byzantine fault tolerance with a spinning primary. In: Proceedings of the 28th international symposium on reliable distributed systems, pp 135–144
Veronese GS, Correia M, Bessani AN, Lung LC (2010) EBAWA: efficient Byzantine agreement for wide-area networks. In: Proceedings of the 12th symposium on high-assurance systems engineering, pp 10–19
Yin J, Martin JP, Venkataramani A, Alvisi L, Dahlin M (2003) Separating agreement from execution for Byzantine fault tolerant services. In: Proceedings of the 19th symposium on operating systems principles, pp 253–267

Download references

Acknowledgements

This work was partially supported by the German Research Council (DFG) under Grant No. DI 2097/1-2 (“REFIT—Resource-Efficient Fault and Intrusion Tolerance”).

Author information

Authors and Affiliations

Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany
Michael Eischer & Tobias Distler

Authors

Michael Eischer
View author publications
You can also search for this author in PubMed Google Scholar
Tobias Distler
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michael Eischer.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Eischer, M., Distler, T. Scalable Byzantine fault-tolerant state-machine replication on heterogeneous servers. Computing 101, 97–118 (2019). https://doi.org/10.1007/s00607-018-0652-3

Download citation

Received: 21 September 2017
Accepted: 11 August 2018
Published: 21 August 2018
Issue Date: 12 February 2019
DOI: https://doi.org/10.1007/s00607-018-0652-3

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Scalable Byzantine fault-tolerant state-machine replication on heterogeneous servers

Abstract

Access this article

Similar content being viewed by others

Towards Efficient and Robust BFT Protocols with ER-BFT (Short Paper)

BChain: Byzantine Replication with High Throughput and Embedded Reconfiguration

Supr: Adaptive Byzantine Fault-Tolerant Replication

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

Scalable Byzantine fault-tolerant state-machine replication on heterogeneous servers

Abstract

Access this article

Similar content being viewed by others

Towards Efficient and Robust BFT Protocols with ER-BFT (Short Paper)

BChain: Byzantine Replication with High Throughput and Embedded Reconfiguration

Supr: Adaptive Byzantine Fault-Tolerant Replication

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation