A technique for constructing highly available services

Algorithmica

Abstract

This paper describes a general method for constructing a highly available service for use in a distributed system. It gives a specific implementation of the method and proves the implementation correct. The service consists of replicas that reside at several different locations in a network. It presents its clients with a consistent view of its state, but the view may contain old information. Clients can indicate how recent the information must be. The method can be used in applications satisfying certain semantic constraints. For applications that can use it, the method performs better than other replication techniques.
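The client-facing behaviour described in the abstract (a consistent but possibly stale view, with the client stating how recent the information must be) can be illustrated with a minimal sketch. The abstract does not say how recency is represented, so everything below is an assumption made for illustration: the per-replica counter timestamps, the `Replica` class, and the `update`, `query`, and `gossip_from` names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumption: recency is expressed as one counter per replica.
Timestamp = Tuple[int, ...]


def at_least(a: Timestamp, b: Timestamp) -> bool:
    """Return True if timestamp a is componentwise >= timestamp b."""
    return all(x >= y for x, y in zip(a, b))


def merge(a: Timestamp, b: Timestamp) -> Timestamp:
    """Return the componentwise maximum of two timestamps."""
    return tuple(max(x, y) for x, y in zip(a, b))


@dataclass
class Replica:
    index: int                      # position of this replica in the timestamp
    n: int                          # total number of replicas
    state: Dict[str, str] = field(default_factory=dict)
    ts: Timestamp = ()

    def __post_init__(self) -> None:
        self.ts = (0,) * self.n

    def update(self, key: str, value: str) -> Timestamp:
        """Apply an update locally and return a timestamp that covers it."""
        self.state[key] = value
        bumped = list(self.ts)
        bumped[self.index] += 1
        self.ts = tuple(bumped)
        return self.ts

    def query(self, key: str, min_ts: Timestamp
              ) -> Optional[Tuple[Optional[str], Timestamp]]:
        """Answer only if this replica is at least as recent as min_ts.

        Returns None when the replica is too stale; the client may then
        wait or ask another replica.
        """
        if not at_least(self.ts, min_ts):
            return None
        return self.state.get(key), self.ts

    def gossip_from(self, other: "Replica") -> None:
        """Pull another replica's state (simplified: no conflict handling)."""
        self.state.update(other.state)
        self.ts = merge(self.ts, other.ts)
```

In this sketch, a client that updates one replica keeps the returned timestamp and passes it as `min_ts` on later queries. A replica that has not yet heard about the update declines to answer, while a client that tolerates old information can pass an all-zero timestamp and be served immediately.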

Additional information

Communicated by Jeffrey Scott Vitter.

This research was supported in part by the Advanced Research Projects Agency of the Department of Defense, monitored by the Office of Naval Research under Contract No. N00014-83-K-0125, by the National Science Foundation under Grant No. DCR-8503662, and by the HTI Postdoctoral Fellowship.

About this article

Cite this article

Ladin, R., Liskov, B. & Shrira, L. A technique for constructing highly available services. Algorithmica 3, 393–420 (1988). https://doi.org/10.1007/BF01762124
