An Asynchronous Protocol for Release Consistent Distributed Shared Memory Systems

The Journal of Supercomputing

Abstract

Distributed shared memory (DSM) systems provide a simple programming paradigm for networks of workstations, which are gaining popularity because they offer high computing power at low cost. However, DSM systems usually perform poorly because of the large communication delay between nodes, and many memory consistency models have been proposed to mask this network delay. In this paper, we propose an asynchronous protocol for the release consistent memory model, which we call the Asynchronous Release Consistency (ARC) protocol. Unlike other protocols, in which communication adheres to the synchronous request/receive paradigm, the ARC protocol is asynchronous: the necessary pages are broadcast before they are requested. Hence, the network delay can be reduced by properly prefetching the necessary pages. We also compared the performance of the ARC protocol with that of the lazy release protocol by running standard benchmark programs; the experimental results show that the ARC protocol achieves a performance improvement of up to 29%.
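The contrast between fetching pages on demand and broadcasting them before they are requested can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the ARC protocol as specified in the paper: the `Node` class, the `release` and `run` functions, the page names, and the version-counter model are all hypothetical, and the sketch ignores the bandwidth cost of broadcasting and any mechanism the real protocols use to track modifications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A workstation holding local copies of shared pages (page id -> version)."""
    name: str
    pages: dict = field(default_factory=dict)
    stalls: int = 0  # accesses that had to wait for a remote page

    def read(self, page, home):
        """Read a page; fetch it from `home` on demand if the local copy is stale."""
        if self.pages.get(page) != home.pages[page]:
            self.stalls += 1  # synchronous round trip: the reader blocks here
            self.pages[page] = home.pages[page]
        return self.pages[page]


def release(writer, dirty_pages, peers, push):
    """Hypothetical release: the writer leaves a critical section that modified pages."""
    for page in dirty_pages:
        writer.pages[page] = writer.pages.get(page, 0) + 1
    if push:
        # Push style (assumption): send updated pages to peers before they ask.
        for peer in peers:
            for page in dirty_pages:
                peer.pages[page] = writer.pages[page]


def run(push, rounds=3):
    """Count blocking remote fetches seen by two readers over several rounds."""
    writer, readers = Node("w"), [Node("r1"), Node("r2")]
    for _ in range(rounds):
        release(writer, ["p0", "p1"], readers, push)
        for reader in readers:
            reader.read("p0", writer)
            reader.read("p1", writer)
    return sum(reader.stalls for reader in readers)


demand_stalls = run(push=False)  # every read after a release blocks on a fetch
push_stalls = run(push=True)     # pages arrive before the reads, so none block
print(demand_stalls, push_stalls)
```

In this toy model the on-demand run blocks on every read that follows a release, while the push run never blocks. The sketch shows only where the waiting moves; it does not model whether the extra broadcast traffic pays off, which is what the benchmark comparison in the paper addresses.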

About this article

Cite this article

Yeo, J., Yeom, H.Y. & Park, T. An Asynchronous Protocol for Release Consistent Distributed Shared Memory Systems. The Journal of Supercomputing 24, 25–41 (2003). https://doi.org/10.1023/A:1020937425960

  • DOI: https://doi.org/10.1023/A:1020937425960