Abstract
If a process fails in a distributed system, proper recovery requires that the messages sent to that process also be recovered. We present sufficient conditions for recovering the messages for a distributed application. For a general-purpose recovery technique, these also become necessary conditions. From the conditions it is clear that requiring messages to be recovered in the same order as they were received by a process before failure is a stricter requirement than necessary.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jalote, P. (1995). Formalizing inductive proofs of message recovery in distributed systems. In: Kanchanasut, K., Lévy, JJ. (eds) Algorithms, Concurrency and Knowledge. ACSC 1995. Lecture Notes in Computer Science, vol 1023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60688-2_44
DOI: https://doi.org/10.1007/3-540-60688-2_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60688-8
Online ISBN: 978-3-540-49262-7
eBook Packages: Springer Book Archive