DOI: 10.1145/3127479.3131612
research-article

Remote memory in the age of fast networks

Published: 24 September 2017

ABSTRACT

As the latency of the network approaches that of memory, it becomes increasingly attractive for applications to use remote memory: random-access memory at another computer that is accessed using the virtual memory subsystem. This is an old idea whose time has come in the age of fast networks. To work effectively, remote memory must address many technical challenges. In this paper, we enumerate these challenges, discuss their feasibility, explain how some of them are addressed by recent work, and indicate other promising ways to tackle them. Some challenges remain open problems, while others deserve more study. We hope to provide a broad research agenda around this topic by proposing more problems than solutions.

  • Published in

    SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
    September 2017
    672 pages
    ISBN: 9781450350280
    DOI: 10.1145/3127479

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 169 of 722 submissions, 23%
