Fault-tolerant atomic computations in an object-based distributed system

Ahamad, Mustaque; Dasgupta, Partha; LeBlanc, Richard J.

doi:10.1007/BF01786632

Published: June 1990

Volume 4, pages 69–80, (1990)

Distributed Computing

Abstract

A distributed system can support fault-tolerant applications by replicating data and computation at nodes that have independent failure modes. We present a scheme called parallel execution threads (PET) that can be used to implement fault-tolerant computations in an object-based distributed system. In a system that replicates objects, the PET scheme can be used to replicate a computation by creating a number of parallel threads that execute with different replicas of the invoked objects. A computation can be completed successfully if at least one thread does not encounter any failed nodes and its completion preserves the consistency of the objects. The PET scheme can tolerate failures that occur during the execution of the computation as long as at least one thread is unaffected by them. We present the algorithms required to implement the PET scheme and also address some performance issues.
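The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithms, which are not reproduced on this page. All names here (`Replica`, `NodeFailure`, `run_pet`) are hypothetical, each object is assumed to have one replica per node, and the commit step is reduced to a single lock so that exactly one surviving thread's result is installed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading


class NodeFailure(Exception):
    """Raised when a thread touches a replica on a failed node (hypothetical)."""


class Replica:
    """One copy of an object, hosted on one node (hypothetical model)."""

    def __init__(self, node, value, failed=False):
        self.node = node
        self.value = value
        self.failed = failed

    def invoke(self, delta):
        # A thread that reaches a failed node cannot continue.
        if self.failed:
            raise NodeFailure(self.node)
        # Compute a private result; the replica itself is not updated
        # until one thread has been chosen to commit.
        return self.value + delta


def run_pet(replicas, delta):
    """Run one thread per replica and commit the first one that succeeds."""
    commit_lock = threading.Lock()
    committed = {}

    def thread_body(replica):
        result = replica.invoke(delta)
        with commit_lock:
            # Only one thread may commit, which keeps the replicas consistent.
            if not committed:
                committed["node"] = replica.node
                committed["value"] = result
        return result

    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(thread_body, r) for r in replicas]
        for future in as_completed(futures):
            try:
                future.result()
            except NodeFailure:
                pass  # this thread is lost; the others may still finish

    if not committed:
        raise RuntimeError("all threads encountered failed nodes")

    # Install the committed value on every replica that is still reachable.
    for r in replicas:
        if not r.failed:
            r.value = committed["value"]
    return committed["value"]
```

For example, with replicas on three nodes of which one has failed, `run_pet([Replica("n1", 10, failed=True), Replica("n2", 10), Replica("n3", 10)], 5)` returns 15: the thread bound to `n1` is lost, and one of the two surviving threads commits. If every replica is on a failed node, the sketch raises `RuntimeError`, matching the condition that at least one thread must complete.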

Author information

Authors and Affiliations

School of Information and Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA
Mustaque Ahamad, Partha Dasgupta & Richard J. LeBlanc Jr.

Additional information

Mustaque Ahamad received his B.E. (Hons.) degree in Electrical Engineering from the Birla Institute of Technology and Science, Pilani, India. He obtained his M.S. and Ph.D. degrees in Computer Science from the State University of New York at Stony Brook in 1983 and 1985, respectively. Since September 1985, he has been an Assistant Professor in the School of Information and Computer Science at the Georgia Institute of Technology, Atlanta. His research interests include distributed operating systems, distributed algorithms, fault-tolerant systems and performance evaluation.

Partha Dasgupta has been an Assistant Professor at Georgia Tech since 1984. He has a Ph.D. in Computer Science from the State University of New York at Stony Brook. He is the technical project director of the Clouds distributed operating systems project, as well as a co-principal investigator of Georgia Tech's NSF-CER award. His research interests include building distributed operating systems, distributed algorithms, fault-tolerant systems and distributed programming support.

Richard J. LeBlanc, Jr. received the B.S. degree in physics from Louisiana State University in 1972 and the M.S. and Ph.D. degrees in computer sciences from the University of Wisconsin-Madison in 1974 and 1977, respectively. He is currently a Professor in the School of Information and Computer Science of the Georgia Institute of Technology. His research interests include programming language design and implementation, programming environments, and software engineering. Dr. LeBlanc's current research work involves application of these interests in distributed processing systems. As co-director of the Clouds Project, he is studying language concepts and software engineering methodology for utilizing a highly reliable, object-based distributed system. He is also interested in specification-based software development methodologies and tools. Dr. LeBlanc is a member of the Association for Computing Machinery, the IEEE Computer Society and Sigma Xi.

This work was supported in part by NSF grants CCR-8619886 and CCR-8806358, and RADC contract number F30602-86-C-0032.

About this article

Cite this article

Ahamad, M., Dasgupta, P. & LeBlanc, R.J. Fault-tolerant atomic computations in an object-based distributed system. Distrib Comput 4, 69–80 (1990). https://doi.org/10.1007/BF01786632

Received: 22 December 1988
Accepted: 20 February 1990
Issue Date: June 1990
DOI: https://doi.org/10.1007/BF01786632
