On-line self-checking of replication consistency for autonomic computing

Bartoli, Alberto; Masarin, Giovanni

doi:10.1007/s10586-006-0012-5

Published: October 2006

Cluster Computing, Volume 9, pages 449–463 (2006)

Alberto Bartoli¹ &
Giovanni Masarin¹

Abstract

In this paper we are concerned with the live verification of the consistency of a replicated system, an issue that has not been addressed by the research community so far. We consider how to enable the system to detect, automatically and in production, whether the invariants defining the correctness of object replication are violated. This feature could greatly improve the dependability of distributed applications and is necessary for constructing self-managing and self-healing replicated systems. We focus on systems that enforce strongly consistent replication: all replicas of each object must be kept “continuously” in sync. This replication strategy is appropriate for application domains where correctness guarantees in spite of failures are more important than performance and scalability. We present the design and implementation of a replicated web service capable of self-checking whether all replicas are indeed kept in sync. This check occurs on-line, transparently to clients. We also discuss the performance cost of self-checking in our prototype.

Author information

Authors and Affiliations

Dipartimento di Elettrotecnica, Elettronica, Informatica, University of Trieste, Via Valerio 10, 34100, Trieste, Italy
Alberto Bartoli & Giovanni Masarin

Corresponding author

Correspondence to Alberto Bartoli.

Additional information

Alberto Bartoli is Associate Professor of Computer Engineering at the University of Trieste, Italy. He received a degree in Electrical Engineering in 1989 and a doctorate in Computer Engineering in 1994, both from the University of Pisa, Italy. His research interests are in the area of reliability and fault tolerance in distributed systems.

Giovanni Masarin received a degree in Electronic Engineering in 2004 from the University of Trieste, Italy. He is currently involved in product development at RadioTrevisan, a company specializing in the production of lawful interception equipment.

About this article

Cite this article

Bartoli, A., Masarin, G. On-line self-checking of replication consistency for autonomic computing. Cluster Comput 9, 449–463 (2006). https://doi.org/10.1007/s10586-006-0012-5

Issue Date: October 2006
DOI: https://doi.org/10.1007/s10586-006-0012-5
