DOI: 10.1145/3343180.3343190
research-article

On the Impact of Cluster Configuration on RoCE Application Design

Published: 17 August 2019

Abstract

RDMA over Converged Ethernet (RoCE) allows RDMA-enabled NICs to operate in datacenter networks. This study identifies how different aspects of datacenter cluster configuration affect the latency, throughput, and CPU utilization of the different ways of transferring data in RoCE (the RDMA verbs). We examine the impact of colocated applications competing for both the CPU and access to the NIC, as well as the impact of the network MTU. We find that RDMA applications do not share the NIC fairly, that large frames should not be used, and that the correct choice of verb depends on many variables, including application access patterns, object size, and the load on both the local and remote CPUs.

Cited By

  • (2023) Rambda: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 499-515. DOI: 10.1109/HPCA56546.2023.10071127. Online publication date: Feb-2023

Published In

ACM Other conferences
APNet '19: Proceedings of the 3rd Asia-Pacific Workshop on Networking
August 2019
104 pages
ISBN:9781450376358
DOI:10.1145/3343180
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Datacenter
  2. Performance Measurement
  3. RDMA
  4. RDMA Performance

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

APNet '19

Acceptance Rates

Overall Acceptance Rate 50 of 118 submissions, 42%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 10 Feb 2025
