Challenges facing tomorrow's datacenter: summary of the LADiS workshop
- Robbert van Renesse,
- Rodrigo Rodrigues,
- Mike Spreitzer,
- Christopher Stewart,
- Doug Terry,
- Franco Travostino
The 2008 workshop on Large-Scale Distributed Systems and Middleware (LADiS) addressed challenges facing tomorrow's datacenter. Over the course of three days, attendees laid forth an ambitious research agenda that covered hot topics, ranging from fault-...
A simple totally ordered broadcast protocol
This is a short overview of a totally ordered broadcast protocol used by ZooKeeper, called Zab. It is conceptually easy to understand, is easy to implement, and gives high performance. In this paper we present the requirements ZooKeeper places on Zab, we ...
Paxos for System Builders: an overview
This paper presents an overview of Paxos for System Builders, a complete specification of the Paxos replication protocol such that system builders can understand it and implement it. We evaluate the performance of a prototype implementation and detail ...
Towards distributed software transactional memory systems
The recent architectural trend that has led to the widespread adoption of multi-core CPUs has fostered a remarkable research interest in Software Transactional Memory (STM). As STMs are starting to face the high availability and scalability ...
Harnessing the power of DHTs to build dynamic quorums in large-scale enterprise infrastructures
Recently, enterprises owning a large IT hardware and software infrastructure have started looking at peer-to-peer technologies as a means both to reduce costs and to help their technical divisions manage a huge number of devices characterized by a high ...
Efficient reconciliation and flow control for anti-entropy protocols
The paper shows that anti-entropy protocols can process only a limited rate of updates, and proposes and evaluates a new state reconciliation mechanism as well as a flow control scheme for anti-entropy protocols.
Decentralized real-time monitoring of network-wide aggregates
The traditional monitoring paradigm of network and systems management, characterized by a central entity polling individual devices, is not adequate for today's large-scale networked systems whose states and configurations are highly dynamic. We outline ...
Efficient on-demand operations in dynamic distributed infrastructures
In a large-scale distributed infrastructure, users and administrators typically desire to perform on-demand operations that act upon the most up-to-date state of the infrastructure. These on-demand operations range from monitoring the up-to-date machine ...
Configuration-space performance anomaly depiction
Complex distributed systems (like those based on J2EE platforms) are designed to perform well for a variety of application workloads and configuration settings. In practice, however, the system performance may not meet the expectation at all execution ...
Dr. Multicast: Rx for data center communication scalability
Data centers avoid IP Multicast (IPMC) because of a series of problems with the technology. We introduce Dr. Multicast (MCMD), a system that maps IPMC operations to a combination of point-to-point unicast and traditional IPMC transmissions. MCMD ...
Defining weakly consistent Byzantine fault-tolerant services
We propose a specification for weak consistency in the context of a replicated service that tolerates Byzantine faults. We define different levels of consistency for the replies that can be obtained from such a service---we use a real world application ...
Low-latency access to robust amnesic storage
We address the problem of building a reliable distributed read/write storage from unreliable storage units, e.g. a collection of servers, of which up to one-third can fail by not responding or by undetectably corrupting the data stored on them. Our ...
BFT: the time is now
Data centers strive to provide reliable access to the data and services that they host. This reliable access requires the data and services hosted by the data center to be both consistent and available. Byzantine fault tolerance (BFT) replication ...
Reducing the costs of large-scale BFT replication
We identify three key challenges in designing large-scale fault-tolerant services. The first is keeping stable best-case performance in the presence of failures, which are increasingly becoming commonplace. The second is that worst-case failures should not ...