PrimCast: A Latency-Efficient Atomic Multicast

Authors:
Leandro Pacheco, Università della Svizzera italiana, Lugano, Switzerland (ORCID 0000-0002-5853-4721)
Paulo Coelho, Federal University of Uberlândia, Uberlândia, Brazil (ORCID 0000-0001-6033-6014)
Fernando Pedone, Università della Svizzera italiana, Lugano, Switzerland (ORCID 0000-0002-2256-0901)

Middleware '23: Proceedings of the 24th International Middleware Conference, November 2023, Pages 124–136. https://doi.org/10.1145/3590140.3629110

Published: 27 November 2023

ABSTRACT

Atomic multicast is a communication abstraction that allows messages to be addressed to, and reliably delivered by, multiple process groups while ensuring a partial order on delivered messages. Strong ordering guarantees can greatly simplify the design and implementation of distributed applications. One property critical to the performance and scalability of an atomic multicast protocol is genuineness: a protocol is genuine if only the sender and the destinations of a message are involved in ordering that message. This paper presents PrimCast, the first genuine atomic multicast protocol able to deliver messages at every destination in three communication steps. PrimCast uses a primary-based consensus protocol to decide on message timestamps at each group. Unlike previous work, it does not rely on consensus for advancing and maintaining logical clocks. PrimCast introduces a novel approach, relying on simple quorum intersection, to decide when a multicast message can be delivered. We also show how loosely synchronized clocks can be used to reduce the convoy effect, which delays messages under high system load. We present the complete algorithm for PrimCast and evaluate its performance under various scenarios. Our results show that PrimCast achieves lower latency than state-of-the-art approaches while providing higher or comparable throughput.
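
To make the role of per-group timestamps concrete, the following is a minimal sketch of classic timestamp-based (Skeen-style) ordering. It is not PrimCast's algorithm. It assumes that each destination group proposes a local timestamp for a message, that the final timestamp is the maximum of those proposals, and that every group delivers its messages in final-timestamp order. The abstract does not confirm this rule for PrimCast, and the sketch leaves out consensus, quorums, and failures entirely. The names `Multicast`, `final_timestamp`, `delivery_order`, and the group names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Multicast:
    msg_id: str
    dests: frozenset  # names of the destination groups


def final_timestamp(local_ts: dict, msg: Multicast) -> int:
    """Assumed rule: final timestamp = max of the destination groups' proposals."""
    missing = msg.dests - set(local_ts)
    if missing:
        raise ValueError(f"no proposal yet from groups: {sorted(missing)}")
    # Only the destination groups contribute, which mirrors genuineness.
    return max(local_ts[g] for g in msg.dests)


def delivery_order(group: str, proposals: dict) -> list:
    """Message ids in the order `group` would deliver them.

    `proposals` maps each Multicast to {group name: proposed local timestamp}.
    Ties on the final timestamp are broken by message id.
    """
    mine = [m for m in proposals if group in m.dests]
    mine.sort(key=lambda m: (final_timestamp(proposals[m], m), m.msg_id))
    return [m.msg_id for m in mine]


# Illustrative data only: three messages addressed to overlapping groups.
m1 = Multicast("m1", frozenset({"g1", "g2"}))
m2 = Multicast("m2", frozenset({"g2", "g3"}))
m3 = Multicast("m3", frozenset({"g1"}))
proposals = {
    m1: {"g1": 3, "g2": 7},  # final timestamp 7
    m2: {"g2": 5, "g3": 4},  # final timestamp 5
    m3: {"g1": 4},           # final timestamp 4
}

print(delivery_order("g1", proposals))  # ['m3', 'm1']
print(delivery_order("g2", proposals))  # ['m2', 'm1']
print(delivery_order("g3", proposals))  # ['m2']
```

Because every group sorts by the same final timestamps, any two groups that share messages deliver them in the same relative order, which is the partial order the abstract refers to.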

Index Terms

1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms
2. Software and its engineering
  1. Software organization and properties
    1. Extra-functional properties
      1. Software fault tolerance
    2. Software functional properties
      1. Correctness
        1. Consistency

Published in

Middleware '23: Proceedings of the 24th International Middleware Conference
November 2023
334 pages
ISBN: 9798400701771
DOI: 10.1145/3590140

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
atomic multicast
distributed agreement
fault-tolerant distributed systems
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 203 of 948 submissions, 21%