ABSTRACT
Distributed software systems are the basis for many innovative applications. The key to achieving scalable and maintainable distributed systems is dependability, because otherwise the complexity of distribution would leave the system uncontrollable. Hence, our approach aims at a concept for optimizing dependability. Like other approaches, we use replication as a means to provide transparent fault tolerance and persistence, but we focus especially on increasing availability by relaxing data integrity through a mixture of asynchronous and synchronous replication techniques. This work makes three main contributions: first, a description of the envisioned trade-off between availability and consistency; second, a mechanism to achieve this trade-off; and third, models that use this mechanism and can be transparently deployed by developers. The goal is to enable a configurable, application-specific optimum of availability, possibly even controlled at runtime. A real-life telecommunication application serves as proof of concept.
Index Terms
- Dependable distributed systems