
Collaborative Memories in Clusters: Opportunities and Challenges

Transactions on Computational Science XXII

Part of the book series: Lecture Notes in Computer Science ((TCOMPUTATSCIE,volume 8360))

Abstract

Highly integrated distributed systems such as the Intel Micro Server and the SeaMicro Server are becoming an increasingly popular server architecture. Designers of such systems face interesting memory-hierarchy design challenges as they attempt to reduce or eliminate swapping to disk storage, which drastically slows down application execution. Swapping to the free remote memory of nearby nodes through memory collaboration has been shown to be cost-effective compared to overprovisioning memory for peak load requirements. Recent studies propose several ways to access under-utilized remote memory in static system configurations, but do not explore dynamic memory collaboration in detail. Dynamic collaboration is important given the run-time fluctuations in memory usage in clustered systems. Furthermore, with the growing interest in memory collaboration, it is crucial to understand the existing performance bottlenecks, overheads, and potential optimizations.

In this paper we address these two issues. First, we propose an Autonomous Collaborative Memory System (ACMS) that manages memory resources dynamically at run time to optimize performance and provide QoS measures for nodes participating in the system. We implement a prototype of the proposed ACMS, experiment with a wide range of real-world applications, and show up to 3x speedup compared to a non-collaborative memory system, without perceivable performance impact on nodes that provide memory. Second, we analyze in depth the end-to-end overhead and bottlenecks of memory collaboration. Based on this analysis, we provide insights into several corresponding optimizations to further improve performance.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Samih, A., Wang, R., Maciocco, C., Kharbutli, M., Solihin, Y. (2014). Collaborative Memories in Clusters: Opportunities and Challenges. In: Gavrilova, M.L., Tan, C.J.K. (eds) Transactions on Computational Science XXII. Lecture Notes in Computer Science, vol 8360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54212-1_2

  • DOI: https://doi.org/10.1007/978-3-642-54212-1_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54211-4

  • Online ISBN: 978-3-642-54212-1

  • eBook Packages: Computer Science, Computer Science (R0)
