DOI: 10.1145/3600061.3600082
research-article
Open Access

sRDMA: A General and Low-Overhead Scheduler for RDMA

Published: 05 September 2023

ABSTRACT

Remote Direct Memory Access (RDMA) has been widely deployed in data centers to improve application performance. However, RDMA delivers messages in order, which cannot meet applications' emerging need to schedule messages within an RDMA connection and leaves RDMA underutilized. Some works schedule the data to be transferred in specific applications before handing it to RDMA, or distribute messages across different connections. However, these approaches tightly couple scheduling logic with application logic and may incur high scheduling overhead.

In this paper, we propose sRDMA, a general and low-overhead scheduler that works in the user-space RDMA driver. sRDMA allows the application to express the expected transfer order to RDMA hardware via work requests (WRs). Using the priority information in WRs, sRDMA slices and schedules WRs to achieve the desired order of message transfer and to reduce the blocking impact of large messages in the same RDMA connection. Our experiments show that sRDMA can improve the performance of applications, e.g., TensorFlow, by up to , and sRDMA has negligible overhead in terms of CPU and flow throughput.
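
The idea of slicing WRs and scheduling the slices by priority can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the sRDMA implementation: the names `Slice`, `SliceScheduler`, `post`, `next_slice`, and `demo_order`, the 4096-byte slice size, and the convention that a lower number means higher priority are all hypothetical, and no real RDMA verbs are called.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

SLICE_BYTES = 4096  # hypothetical slice size, not taken from the paper


@dataclass(order=True)
class Slice:
    # Ordering uses (priority, seq) only; lower priority value goes first
    # (an assumption of this sketch).
    priority: int
    seq: int
    wr_id: int = field(compare=False)
    offset: int = field(compare=False)
    length: int = field(compare=False)


class SliceScheduler:
    """Toy model: cut each work request into slices, send by priority."""

    def __init__(self, slice_bytes=SLICE_BYTES):
        self.slice_bytes = slice_bytes
        self._heap = []
        self._seq = count()  # keeps FIFO order among equal priorities

    def post(self, wr_id, length, priority):
        # Split the message into fixed-size slices and queue each one.
        for offset in range(0, length, self.slice_bytes):
            chunk = min(self.slice_bytes, length - offset)
            heapq.heappush(
                self._heap,
                Slice(priority, next(self._seq), wr_id, offset, chunk),
            )

    def next_slice(self):
        # Return the most urgent pending slice, or None when idle.
        return heapq.heappop(self._heap) if self._heap else None


def demo_order():
    """Return the WR ids in the order their slices would be sent."""
    sched = SliceScheduler()
    sched.post(wr_id=1, length=16384, priority=5)  # large, low priority
    order = [sched.next_slice().wr_id]             # first slice is in flight
    sched.post(wr_id=2, length=1024, priority=0)   # small, urgent arrival
    while (s := sched.next_slice()) is not None:
        order.append(s.wr_id)
    return order
```

In this model the small urgent message waits for at most one slice of the large message rather than for the whole message, which is the blocking effect the abstract describes reducing within a single connection.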


      • Published in

        APNET '23: Proceedings of the 7th Asia-Pacific Workshop on Networking
        June 2023
        229 pages
        ISBN:9798400707827
        DOI:10.1145/3600061

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited