Active Memory Clusters: Efficient Multiprocessing on Commodity Clusters

  • Conference paper
High Performance Computing (ISHPC 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2327)

Abstract

We show how novel active memory system research and system networking trends can be combined to realize hardware distributed shared memory (DSM) on clusters of industry-standard workstations. Our active memory controller extends the cache coherence protocol to support transparent use of address re-mapping techniques that dramatically improve single-node performance, and it also contains the functionality needed to build a hardware DSM machine. Simultaneously, commodity network technology is becoming more tightly integrated with the memory controller. We call our design, in which active memory commodity nodes are interconnected by a next-generation network, active memory clusters (AMC). We present a detailed design of the AMC architecture, focusing on the active memory controller and the network characteristics necessary to support AMC. Simulation results for a range of parallel applications show that AMC performance is comparable to that of custom hardware DSM systems and far exceeds that of the fastest software DSM solutions.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Heinrich, M., Speight, E., Chaudhuri, M. (2002). Active Memory Clusters: Efficient Multiprocessing on Commodity Clusters. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds) High Performance Computing. ISHPC 2002. Lecture Notes in Computer Science, vol 2327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47847-7_9

  • DOI: https://doi.org/10.1007/3-540-47847-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43674-4

  • Online ISBN: 978-3-540-47847-8

  • eBook Packages: Springer Book Archive
