Shared Virtual Memory Clusters with Next-Generation Interconnection Networks and Wide Compute Nodes

Gibson, Courtney R.; Bilas, Angelos

doi:10.1007/3-540-45307-5_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2228)

Included in the following conference series:

International Conference on High-Performance Computing

Abstract

Recently, much effort has been spent on providing a shared address space abstraction on clusters of small-scale symmetric multiprocessors (SMPs). However, advances in technology will soon make it possible to construct these clusters from larger-scale cc-NUMA nodes, connected by non-coherent networks that offer latency and bandwidth comparable to those of the interconnection networks used in hardware cache-coherent systems. On such systems, the shared memory abstraction can be provided in software across nodes and in hardware within nodes. In this work we investigate this approach to building future software shared memory clusters. We use an existing, large-scale hardware cache-coherent system with 64 processors to emulate a future cluster, and we present results for both 32- and 64-processor configurations. We quantify the effects of faster interconnects and wide NUMA nodes on system design and identify the areas where more research is required for future shared virtual memory (SVM) clusters. We find that current SVM protocols can only partially take advantage of faster interconnects and need to be adjusted to the new system features. In particular, unlike in today’s clusters that employ SMP nodes, improving intra-node synchronization and data placement are key issues for future clusters. Data wait time and synchronization costs are not major issues when they are not affected by the cost of page invalidations.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Toronto, M5S 3G4, Toronto, Ontario, Canada
Courtney R. Gibson & Angelos Bilas

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Paderborn, Fürstenallee 11, 33102, Paderborn, Germany
Burkhard Monien
Department of EE-Systems, Computer Engineering Division, University of Southern California, 3740 McClintock Avenue, EEB 200C, 90089-2562, Los Angeles, CA, USA
Viktor K. Prasanna
Independent Consultant, c/o Infosys Ltd., “Mangala”, Kuloor Ferry Road, Kottara, 575006, Mangalore, India
Sriram Vajapeyam

About this paper

Cite this paper

Gibson, C.R., Bilas, A. (2001). Shared Virtual Memory Clusters with Next-Generation Interconnection Networks and Wide Compute Nodes. In: Monien, B., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2001. HiPC 2001. Lecture Notes in Computer Science, vol 2228. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45307-5_15

DOI: https://doi.org/10.1007/3-540-45307-5_15
Published: 04 December 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43009-4
Online ISBN: 978-3-540-45307-9
eBook Packages: Springer Book Archive
