DOI: 10.1145/2731186.2731200
research-article

A Hybrid I/O Virtualization Framework for RDMA-capable Network Interfaces

Published: 14 March 2015

Abstract

RDMA-capable interconnects, providing ultra-low latency and high bandwidth, are increasingly being used in the context of distributed storage and data processing systems. However, the deployment of such systems in virtualized data centers is currently inhibited by the lack of a flexible and high-performance virtualization solution for RDMA network interfaces.
In this work, we present a hybrid virtualization architecture which builds upon the concept of separation of paths for control and data operations available in RDMA. With hybrid virtualization, RDMA control operations are virtualized using hypervisor involvement, while data operations are set up to bypass the hypervisor completely. We describe HyV (Hybrid Virtualization), a virtualization framework for RDMA devices implementing such a hybrid architecture. In the paper, we provide a detailed evaluation of HyV for different RDMA technologies and operations. We further demonstrate the advantages of HyV in the context of a real distributed system by running RAMCloud on a set of HyV-enabled virtual machines deployed across a 6-node RDMA cluster. All of the performance results we obtained illustrate that hybrid virtualization enables bare-metal RDMA performance inside virtual machines while retaining the flexibility typically associated with paravirtualization.
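
As a rough illustration of the control/data path split described above, the following toy model routes a control operation through a hypervisor object and lets data operations touch the queue directly. It is only a sketch: every class and method name here (`Hypervisor`, `GuestDevice`, `create_queue_pair`, `post_send`) is hypothetical and does not correspond to HyV's code or to any real RDMA verbs API.

```python
from collections import deque


class Hypervisor:
    """Hypothetical host-side component that mediates control operations."""

    def __init__(self):
        self.calls = 0      # number of times the guest involved the hypervisor
        self.next_id = 0

    def control(self, op):
        # Every control operation (resource setup) passes through here.
        self.calls += 1
        self.next_id += 1
        return {"op": op, "id": self.next_id, "queue": deque()}


class GuestDevice:
    """Hypothetical guest-side view of a virtualized RDMA device."""

    def __init__(self, hypervisor):
        self.hv = hypervisor

    def create_queue_pair(self):
        # Control path: forwarded to the hypervisor, which sets up the resource.
        return self.hv.control("create_qp")

    def post_send(self, qp, payload):
        # Data path: the guest writes to the queue directly, with no hypervisor call.
        qp["queue"].append(payload)


hv = Hypervisor()
dev = GuestDevice(hv)
qp = dev.create_queue_pair()
for i in range(1000):
    dev.post_send(qp, i)

# One control operation involved the hypervisor; 1000 data operations did not.
print(hv.calls, len(qp["queue"]))
```

In this model the hypervisor counter stays at one however many data operations are posted, which is the property the hybrid architecture relies on.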


    Published In

    VEE '15: Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
    March 2015
    238 pages
    ISBN: 9781450334501
    DOI: 10.1145/2731186

    ACM SIGPLAN Notices, Volume 50, Issue 7 (VEE '15)
    July 2015
    221 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2817817
    Editor: Andy Gill

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. rdma
    2. virtualization

    Qualifiers

    • Research-article

    Conference

    VEE '15

    Acceptance Rates

    VEE '15 Paper Acceptance Rate 16 of 50 submissions, 32%;
    Overall Acceptance Rate 80 of 235 submissions, 34%

    Article Metrics

    • Downloads (last 12 months): 46
    • Downloads (last 6 weeks): 8
    Reflects downloads up to 27 Feb 2025

    Cited By

    • (2024) PeRF. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference, pages 209-225. DOI: 10.5555/3691992.3692005. Online publication date: 10-Jul-2024
    • (2024) Harmonic. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, pages 1479-1496. DOI: 10.5555/3691825.3691907. Online publication date: 16-Apr-2024
    • (2024) Un-IOV: Achieving Bare-Metal Level I/O Virtualization Performance for Cloud Usage With Migratability, Scalability and Transparency. IEEE Transactions on Computers, 73:7, pages 1655-1668. DOI: 10.1109/TC.2024.3375589. Online publication date: Jul-2024
    • (2024) DockRDMA: Hybrid RDMA Virtualization for Containerized Clouds. 2024 IEEE 32nd International Conference on Network Protocols (ICNP), pages 1-12. DOI: 10.1109/ICNP61940.2024.10858532. Online publication date: 28-Oct-2024
    • (2022) NVMe-oAF. Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, pages 56-70. DOI: 10.1145/3502181.3531476. Online publication date: 27-Jun-2022
    • (2020) Stratus. Proceedings of the 12th USENIX Conference on Hot Topics in Cloud Computing, pages 12-12. DOI: 10.5555/3485849.3485861. Online publication date: 13-Jul-2020
    • (2019) Freeflow. Proceedings of the 16th USENIX Conference on Networked Systems Design and Implementation, pages 113-125. DOI: 10.5555/3323234.3323245. Online publication date: 26-Feb-2019
    • (2019) vSocket: virtual socket interface for RDMA in public clouds. Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 179-192. DOI: 10.1145/3313808.3313813. Online publication date: 14-Apr-2019
    • (2019) HyperCo: Optimizing Network Performance in ARM-Based Mobile Virtualization. IEEE Transactions on Services Computing, 12:1, pages 131-143. DOI: 10.1109/TSC.2016.2594760. Online publication date: 1-Jan-2019
    • (2018) Interdomain I/O Optimization in Virtualized Sensor Networks. Sensors, 18:12, article 4395. DOI: 10.3390/s18124395. Online publication date: 12-Dec-2018
