DOI: 10.1145/3538641.3561495
research-article

hKVS: a framework for designing a high throughput heterogeneous key-value store with SmartNIC and RDMA

Published: 20 October 2022

Abstract

In-memory key-value stores (KVS) are a crucial component of data center applications. Because DRAM provides high bandwidth and low latency, the major performance bottleneck of a typical in-memory KVS lies in the network stack. Prior work has replaced the traditional network stack with remote direct memory access (RDMA), which achieves orders-of-magnitude higher throughput and lower response latency. To further increase the throughput of an in-memory KVS, we propose a framework called hKVS, which enables developers to design high-throughput heterogeneous KVS systems by adding the latest generation of smart network interface cards (SmartNICs), such as the NVIDIA BlueField DPU, to host machines. hKVS enables a host server to exploit the computational resources and RDMA capability of the SmartNICs, offloading work from the host CPU and increasing network bandwidth. hKVS allows popular key-value objects to be replicated from the host to the SmartNIC so that the two jointly form a high-throughput RDMA KVS. We design the architecture of hKVS, optimize its software implementation, and conduct a series of experiments to evaluate the resulting performance in realistic applications. By adding a SmartNIC to the host, hKVS achieves up to 1.86X and 1.48X higher throughput in 100% and 95% read workloads, respectively. This approach is cost-effective and scalable compared to building a KVS with multiple hosts, because a SmartNIC costs much less than a high-performance server and multiple SmartNICs can be added to scale the throughput if needed.
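
The idea of replicating popular objects from the host to a SmartNIC can be illustrated with a small toy model. The sketch below is not the hKVS implementation: the abstract does not specify how popular keys are selected, how reads are routed, or how replicas are kept consistent, so the frequency-based selection, the alternating read routing, and the write-through update are all assumptions. The class and method names (`ReplicatedKVS`, `refresh_replica`, and so on) are hypothetical, and plain Python dictionaries stand in for RDMA-accessible memory on the host and on the SmartNIC.

```python
from collections import Counter


class ReplicatedKVS:
    """Toy model: a host store plus a small replica holding the hottest keys."""

    def __init__(self, replica_capacity):
        self.host = {}                 # full data set (stands in for host DRAM)
        self.replica = {}              # hot subset (stands in for SmartNIC DRAM)
        self.replica_capacity = replica_capacity
        self.hits = Counter()          # per-key read counts used to rank popularity
        self.served_by = Counter()     # how many reads each side has served
        self._turn = 0                 # alternates reads of replicated keys

    def put(self, key, value):
        # Assumed policy: writes go to the host, and an existing replica
        # copy is updated in place (write-through).
        self.host[key] = value
        if key in self.replica:
            self.replica[key] = value

    def get(self, key):
        self.hits[key] += 1
        if key in self.replica:
            # Assumed policy: replicated keys may be read from either side,
            # so alternate between them to spread the read load.
            self._turn ^= 1
            side = "replica" if self._turn else "host"
        else:
            side = "host"
        self.served_by[side] += 1
        store = self.replica if side == "replica" else self.host
        return store[key]

    def refresh_replica(self):
        # Re-select the most frequently read keys and copy them to the replica.
        hot = [k for k, _ in self.hits.most_common(self.replica_capacity)]
        self.replica = {k: self.host[k] for k in hot if k in self.host}
```

In this model, once `refresh_replica()` has copied the hottest keys, every second read of a replicated key is served by the replica. That is the intuition behind adding a second serving path for read-heavy, skewed workloads; the model says nothing about the actual throughput figures reported above.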


Cited By

  • (2024) A Comprehensive Survey on SmartNICs: Architectures, Development Models, Applications, and Research Directions. IEEE Access 12, 107297-107336. DOI: 10.1109/ACCESS.2024.3437203. Online publication date: 2024
  • (2023) The Global Trends of Publications on Complex Networks and the Contribution From the Computer Sciences and Engineering Subject Areas. 2023 IEEE Seventh Ecuador Technical Chapters Meeting (ECTM), 1-6. DOI: 10.1109/ETCM58927.2023.10309003. Online publication date: 10-Oct-2023


      Published In

      RACS '22: Proceedings of the Conference on Research in Adaptive and Convergent Systems
      October 2022
      208 pages
      ISBN:9781450393980
      DOI:10.1145/3538641
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. DPU
      2. SmartNIC
      3. heterogeneous system
      4. key-value store
      5. remote direct memory access (RDMA)

      Qualifiers

      • Research-article

      Conference

      RACS '22
      Acceptance Rates

      Overall Acceptance Rate 393 of 1,581 submissions, 25%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 90
      • Downloads (last 6 weeks): 4
      Reflects downloads up to 17 Jan 2025.
