research-article

Proving Server Faults: RPCs for Distributed Systems in Byzantine Networks

Authors:
Jonathan Weiss, Hebrew University of Jerusalem, Jerusalem, Israel
Albert Kwon, Badge Inc., Fremont, CA, USA
Yossi Gilad, Hebrew University of Jerusalem, Jerusalem, Israel

HotNets '20: Proceedings of the 19th ACM Workshop on Hot Topics in Networks, November 2020, Pages 74–80
https://doi.org/10.1145/3422604.3425942

Published: 04 November 2020

ABSTRACT

Distributed systems are often designed to recover from downed nodes. Unfortunately, it is challenging to create recovery mechanisms that work in Byzantine networks, where the attacker controls some of the nodes and links. An adversarial node can often lie about an honest node being offline, and there is no way to verify this claim or detect the liar.

To resolve this challenge, we design rRPC, a robust remote procedure call library for distributed systems running in Byzantine networks. rRPC ensures that either the call succeeds or the online party (caller or callee) can create a third-party verifiable proof that the other party is faulty. A distributed system can use these proofs to identify and remove a faulty node automatically. We implement a prototype of rRPC and use 20 to 100 EC2 VMs to evaluate its performance, both as a standalone library and in the context of a distributed mix-net system. Our results show that rRPC's overhead is low: it adds about a 1% increase in latency in the common case.
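
The abstract does not say how rRPC builds its proofs, so the following Go sketch only illustrates what "third-party verifiable" can mean in general. It assumes a hypothetical fault class (a callee that signs two different responses to the same request) and hypothetical names (`SignedResponse`, `FaultProof`, `Respond`, `VerifyProof`). It is not the rRPC protocol, and it does not cover the harder case of proving that a node failed to respond.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// SignedResponse is a hypothetical wire format: a reply bound to one request ID.
type SignedResponse struct {
	RequestID uint64
	Payload   []byte
	Sig       []byte
}

// FaultProof (hypothetical) holds two conflicting signed replies to one request.
type FaultProof struct {
	A, B SignedResponse
}

// digest binds the request ID to the payload so a signature
// cannot be reused for a different request.
func digest(id uint64, payload []byte) []byte {
	var idBytes [8]byte
	binary.BigEndian.PutUint64(idBytes[:], id)
	h := sha256.New()
	h.Write(idBytes[:])
	h.Write(payload)
	return h.Sum(nil)
}

// newKeyPair generates an Ed25519 key pair using the system random source.
func newKeyPair() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// Respond produces the callee's signed reply to request id.
func Respond(priv ed25519.PrivateKey, id uint64, payload []byte) SignedResponse {
	return SignedResponse{
		RequestID: id,
		Payload:   payload,
		Sig:       ed25519.Sign(priv, digest(id, payload)),
	}
}

// VerifyProof can be run by any third party that knows only the callee's
// public key. It accepts the proof if both replies carry valid signatures,
// refer to the same request, and disagree on the payload.
func VerifyProof(calleePub ed25519.PublicKey, p FaultProof) bool {
	if p.A.RequestID != p.B.RequestID || bytes.Equal(p.A.Payload, p.B.Payload) {
		return false
	}
	return ed25519.Verify(calleePub, digest(p.A.RequestID, p.A.Payload), p.A.Sig) &&
		ed25519.Verify(calleePub, digest(p.B.RequestID, p.B.Payload), p.B.Sig)
}

func main() {
	pub, priv := newKeyPair()
	first := Respond(priv, 1, []byte("balance=10"))
	second := Respond(priv, 1, []byte("balance=99"))
	fmt.Println("conflicting replies accepted as proof:",
		VerifyProof(pub, FaultProof{A: first, B: second}))
	fmt.Println("identical replies accepted as proof:",
		VerifyProof(pub, FaultProof{A: first, B: first}))
}
```

Because the verifier needs nothing but the accused node's public key, an honest node cannot be framed in this sketch without forging its signature, which is the property that lets a system act on such proofs automatically.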

Published in
HotNets '20: Proceedings of the 19th ACM Workshop on Hot Topics in Networks
November 2020
228 pages
ISBN: 9781450381451
DOI: 10.1145/3422604
General Chairs: Ben Zhao (University of Chicago), Heather Zheng (University of Chicago)
Program Chairs: Harsha V. Madhyastha (University of Michigan), Venkat Padmanabhan (Microsoft Research India)
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
byzantine networks
distributed systems
fault tolerance
Acceptance Rates
Overall Acceptance Rate: 110 of 460 submissions, 24%