DOI: 10.1145/3448016.3452817
research-article

CoRM: Compactable Remote Memory over RDMA

Published: 18 June 2021

ABSTRACT

Distributed memory systems are becoming increasingly important because they provide a system-scale abstraction in which physically separated memories can be addressed as a single logical memory. This abstraction enables memory disaggregation, allowing systems such as in-memory databases, caching services, and ephemeral storage to be deployed naturally at large scale. While this abstraction effectively increases the memory capacity of these systems, it incurs additional overhead for remote memory accesses. To narrow the gap between local and remote accesses, low-latency RDMA networks are a key element of efficient memory disaggregation. However, RDMA acceleration poses new obstacles to efficient memory management, and particularly to memory compaction: network controllers and CPUs can access memory concurrently, potentially leading to inconsistencies if memory management operations are not synchronized. To ensure consistency, most distributed memory systems do not provide memory compaction and are therefore exposed to memory fragmentation. We introduce CoRM, an RDMA-accelerated shared memory system that supports memory compaction and ensures strict consistency while providing one-sided RDMA accesses. We show that CoRM sustains high read throughput during normal operation, comparable to that of similar systems that do not provide memory compaction, while experiencing minimal overhead during compaction. CoRM never disrupts RDMA connections and can reduce applications' active memory by up to 6x by performing memory compaction.
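
To make the link between fragmentation and active memory concrete, the following is a minimal sketch of what compaction buys in the simplest possible setting. It is not CoRM's algorithm: it assumes fixed-size object slots, freely movable objects, and no concurrent RDMA readers, and the names `SLOTS_PER_BLOCK`, `active_blocks`, and `compact` are hypothetical. It only shows how draining sparsely occupied blocks into fuller ones lowers the number of blocks that must stay resident.

```python
import math

SLOTS_PER_BLOCK = 8  # hypothetical block capacity, in fixed-size object slots


def active_blocks(blocks):
    """Count blocks that still hold at least one live object."""
    return sum(1 for live in blocks if live > 0)


def compact(blocks, slots_per_block=SLOTS_PER_BLOCK):
    """Greedy sketch: drain the emptiest blocks into the fullest ones.

    `blocks` is a list of live-object counts, one entry per block.
    Returns a new list of counts; drained blocks end up with 0 live objects.
    """
    counts = sorted(blocks, reverse=True)  # fullest block first
    dst, src = 0, len(counts) - 1
    while dst < src:
        free = slots_per_block - counts[dst]
        if free == 0:          # destination is full, move to the next one
            dst += 1
            continue
        if counts[src] == 0:   # source is already drained, move to the next one
            src -= 1
            continue
        moved = min(free, counts[src])
        counts[dst] += moved
        counts[src] -= moved
    return counts


# Eight blocks left sparsely occupied after many frees (illustrative numbers).
fragmented = [3, 1, 2, 1, 4, 1, 2, 2]
compacted = compact(fragmented)

live = sum(fragmented)
before = active_blocks(fragmented)
after = active_blocks(compacted)
lower_bound = math.ceil(live / SLOTS_PER_BLOCK)

print(f"live objects: {live}, active blocks: {before} -> {after}")
```

In this toy setting the 16 live objects fit into two blocks, which is also the lower bound `ceil(live / SLOTS_PER_BLOCK)`. A real system has to do considerably more than this sketch: as the abstract notes, objects being moved may be read concurrently by network controllers, so relocation must be synchronized with remote accesses.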



        • Published in

          SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
          June 2021
          2969 pages
          ISBN:9781450383431
          DOI:10.1145/3448016

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 785 of 4,003 submissions, 20%
