DOI: 10.1145/3538641.3561495
research-article

hKVS: a framework for designing a high throughput heterogeneous key-value store with SmartNIC and RDMA

Published: 20 October 2022

ABSTRACT

The in-memory key-value store (KVS) is a crucial component of data center applications. Since DRAM provides high bandwidth and low latency, the major performance bottleneck of a typical in-memory KVS lies in the network stack. Prior work has replaced the traditional network stack with remote direct memory access (RDMA), which achieves orders-of-magnitude higher throughput and reduces response latency. To further increase the throughput of an in-memory KVS, we propose a framework called hKVS, which enables developers to design high-throughput heterogeneous KVS systems by adding the latest generation of smart network interface cards (SmartNICs), such as the NVIDIA BlueField DPU, to the host machines. hKVS enables a host server to exploit the computational resources and the RDMA capability of the SmartNICs, offloading work from the host CPU and increasing the available network bandwidth. hKVS allows popular key-value objects to be replicated from the host to the SmartNIC, so that the two jointly form a high-throughput RDMA KVS. We design the architecture of hKVS, optimize its software implementation, and conduct a series of experiments to evaluate the resulting performance in realistic applications. By adding a SmartNIC to the host, hKVS achieves up to 1.86X and 1.48X higher throughput in 100% and 95% read workloads, respectively. This approach is cost-effective and scalable compared to building a KVS with multiple hosts, because a SmartNIC costs much less than a high-performance server and multiple SmartNICs can be added to scale the throughput if needed.
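The hot-key replication idea in the abstract can be illustrated with a toy model. The sketch below is not the hKVS implementation: it uses plain Python dictionaries in place of host and SmartNIC memory, involves no RDMA, and every name in it (`ToyHeterogeneousKVS`, `replica_capacity`, `rebalance`) is hypothetical. It also assumes, as a simplification, that a read for a replicated key is always served by the replica and that writes go to the host first.

```python
from collections import Counter


class ToyHeterogeneousKVS:
    """Toy model: a full host store plus a small replica of the hottest keys."""

    def __init__(self, replica_capacity):
        self.host = {}                 # full data set (stands in for host DRAM)
        self.nic = {}                  # hot-key replica (stands in for SmartNIC DRAM)
        self.hits = Counter()          # per-key read counts used to find hot keys
        self.served_by = Counter()     # number of reads answered by each side
        self.replica_capacity = replica_capacity

    def put(self, key, value):
        # Assumption: the host copy is authoritative; an existing replica
        # entry is updated in the same call so both copies stay consistent.
        self.host[key] = value
        if key in self.nic:
            self.nic[key] = value

    def get(self, key):
        self.hits[key] += 1
        if key in self.nic:
            self.served_by["nic"] += 1
            return self.nic[key]
        self.served_by["host"] += 1
        return self.host.get(key)

    def rebalance(self):
        # Replicate the currently most-read keys onto the replica.
        hot = [k for k, _ in self.hits.most_common(self.replica_capacity)
               if k in self.host]
        self.nic = {k: self.host[k] for k in hot}


kvs = ToyHeterogeneousKVS(replica_capacity=1)
kvs.put("a", 1)
kvs.put("b", 2)
for _ in range(3):
    kvs.get("a")                       # "a" becomes the popular key
kvs.get("b")
kvs.rebalance()                        # "a" is copied to the replica
kvs.get("a")                           # now answered by the replica
```

In this model the replica absorbs reads for popular keys while the host keeps serving the rest, which is the intuition behind the joint host and SmartNIC store; how the real system selects hot keys, routes client requests, and keeps replicas consistent is not specified in the abstract.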

Published in

RACS '22: Proceedings of the Conference on Research in Adaptive and Convergent Systems
October 2022
208 pages
ISBN: 9781450393980
DOI: 10.1145/3538641

Copyright © 2022 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 393 of 1,581 submissions (25%)