A Low Overhead Logging Scheme for Fast Recovery in Distributed Shared Memory Systems

Park, Taesoon; Yeom, Heon Y.

doi:10.1023/A:1008116511402

Published: February 2000

Volume 15, pages 295–320 (2000)

The Journal of Supercomputing

Abstract

This paper presents an efficient, writer-based logging scheme for recoverable distributed shared memory systems, in which a data item is logged by its writer process rather than by every process that accesses it. Since the writer process maintains the log of data items, volatile storage can be used for logging. Only the readers' access information needs to be logged into the stable storage of the writer process to tolerate multiple failures. Moreover, to reduce the frequency of stable logging, only the data items accessed by multiple processes are logged with their access information when the items are invalidated, and semantic-based optimization of the logging is also considered. Compared with earlier schemes, in which stable logging was performed whenever a process accessed or wrote a new data item, the proposed scheme can significantly reduce both the log size and the logging frequency.
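
The idea in the abstract can be illustrated with a toy, single-machine sketch. It is not the paper's protocol: the `Writer` class, its method names, and the use of plain Python containers for "volatile" and "stable" storage are all assumptions made for illustration. It only shows the bookkeeping the abstract describes: the writer keeps written values in a volatile log, records which other processes read each item, and appends access information to stable storage at invalidation time only if the item was actually shared.

```python
from dataclasses import dataclass, field


@dataclass
class Writer:
    """Hypothetical writer process that owns the log of the items it writes."""
    pid: int
    versions: dict = field(default_factory=dict)      # item -> latest version number
    volatile_log: dict = field(default_factory=dict)  # item -> [(version, value), ...]
    readers: dict = field(default_factory=dict)       # item -> set of remote reader pids
    stable_log: list = field(default_factory=list)    # stand-in for stable storage

    def write(self, item, value):
        """Log the new value in volatile storage only; no stable write here."""
        version = self.versions.get(item, 0) + 1
        self.versions[item] = version
        self.volatile_log.setdefault(item, []).append((version, value))
        self.readers[item] = set()  # a new version has no readers yet
        return version

    def serve_read(self, item, reader_pid):
        """Return the current value and remember remote readers of it."""
        _, value = self.volatile_log[item][-1]
        if reader_pid != self.pid:
            self.readers.setdefault(item, set()).add(reader_pid)
        return value

    def invalidate(self, item):
        """On invalidation, log access information to stable storage,
        but only when some other process read the item."""
        remote = self.readers.pop(item, set())
        if remote:
            self.stable_log.append((item, self.versions[item], sorted(remote)))
            return True
        return False
```

In this sketch, an item that only its writer touched never causes a stable-storage write, while an item read by other processes yields one stable record (item, version, reader list) when it is invalidated. Recovery, failure detection, and the semantic-based optimization mentioned in the abstract are left out entirely.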

Author information

Authors and Affiliations

Department of Computer Engineering, Sejong University, Seoul, 143-747, Korea
Taesoon Park
Department of Computer Science, Seoul National University, Seoul, 151-742, Korea
Heon Y. Yeom

About this article

Cite this article

Park, T., Yeom, H.Y. A Low Overhead Logging Scheme for Fast Recovery in Distributed Shared Memory Systems. The Journal of Supercomputing 15, 295–320 (2000). https://doi.org/10.1023/A:1008116511402
