ABSTRACT
Datacenter Quantized Congestion Notification (DCQCN) [12] is the default congestion control algorithm in RoCEv2 (RDMA over Converged Ethernet v2) networks for Mellanox RDMA (Remote Direct Memory Access) NICs [2], which are among the most widely used NICs at leading industry companies [4, 5, 7, 9]. DCQCN works in three steps: switches mark packets with ECN (Explicit Congestion Notification) when the queue length exceeds the ECN thresholds; receivers respond to ECN-marked packets with CNPs (Congestion Notification Packets); and senders reduce their transmission rate on receiving CNPs. DCQCN has 10+ parameters spread across NICs and switches, covering Alpha Update, Rate Increase and Decrease, the Notification Point, and ECN thresholds [3], and these parameters have a non-negligible impact on network performance. Our experiments confirm that the performance of common AI (Artificial Intelligence) training workloads in RoCEv2 networks (e.g., all-to-all collective communication) varies greatly with the DCQCN parameter settings (§3). Therefore, when deploying applications in practice, the DCQCN parameters need to be carefully tested and tuned to improve network performance.
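The three-step loop described in the abstract can be illustrated with a minimal Python sketch. It is not the NIC or switch implementation: the update rules follow the published DCQCN description [12] in simplified form, and every name and default value here (`EcnThresholds`, `Sender`, `k_min`, `k_max`, `p_max`, `g`, the 100 Gbps line rate) is an assumption chosen for illustration rather than a vendor default.

```python
from dataclasses import dataclass


@dataclass
class EcnThresholds:
    """Switch-side ECN thresholds. Values are illustrative assumptions (KB)."""
    k_min: float = 100.0   # below this queue length nothing is marked
    k_max: float = 400.0   # above this queue length everything is marked
    p_max: float = 0.2     # marking probability reached at k_max


def mark_probability(queue_kb: float, ecn: EcnThresholds) -> float:
    """RED-style marking: 0 up to k_min, linear up to p_max at k_max, 1 above."""
    if queue_kb <= ecn.k_min:
        return 0.0
    if queue_kb > ecn.k_max:
        return 1.0
    return ecn.p_max * (queue_kb - ecn.k_min) / (ecn.k_max - ecn.k_min)


@dataclass
class Sender:
    """Sender (reaction point) state; initial values are assumptions."""
    rate_gbps: float = 100.0     # current sending rate
    target_gbps: float = 100.0   # rate remembered before the last decrease
    alpha: float = 1.0           # congestion estimate in [0, 1]
    g: float = 1.0 / 256         # alpha update gain

    def on_cnp(self) -> None:
        """Rate decrease: cut the rate in proportion to alpha, then raise alpha."""
        self.target_gbps = self.rate_gbps
        self.rate_gbps *= 1.0 - self.alpha / 2.0
        self.alpha = (1.0 - self.g) * self.alpha + self.g

    def on_alpha_timer(self) -> None:
        """Alpha update when no CNP arrived during the timer period."""
        self.alpha *= 1.0 - self.g

    def on_rate_increase(self) -> None:
        """Simplified fast recovery: move halfway back toward the target rate."""
        self.rate_gbps = (self.rate_gbps + self.target_gbps) / 2.0
```

Even in this reduced form, the interaction between parameters is visible: `k_min`, `k_max`, and `p_max` decide how often CNPs are generated, while `g` and the timers decide how strongly and for how long each CNP depresses the sending rate. The full algorithm has additional increase stages and counters that are omitted here.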
- 2020. High Precision Congestion Control. (2020). https://github.com/alibaba-edu/High-Precision-Congestion-Control
- 2022. DCQCN CC Algorithm. (2022). https://enterprise-support.nvidia.com/s/article/DCQCN-CC-algorithm
- 2023. DCQCN Parameters. (2023). https://enterprise-support.nvidia.com/s/article/dcqcn-parameters
- Wei Bai, Shanim Sainul Abdeen, Ankit Agrawal, Krishan Kumar Attre, Paramvir Bahl, Ameya Bhagat, Gowri Bhaskara, Tanya Brokhman, Lei Cao, Ahmad Cheema, et al. 2023. Empowering Azure Storage with RDMA. In NSDI.
- Yixiao Gao, Qiang Li, Lingbo Tang, Yongqing Xi, Pengcheng Zhang, Wenwen Peng, Bo Li, Yaohui Wu, Shaozong Liu, Lei Yan, et al. 2021. When Cloud Storage Meets RDMA. In NSDI.
- Yixiao Gao, Yuchen Yang, Tian Chen, Jiaqi Zheng, Bing Mao, and Guihai Chen. 2018. DCQCN+: Taming large-scale incast congestion in RDMA over Ethernet networks. In ICNP.
- Yimin Jiang, Yibo Zhu, Chang Lan, Bairen Yi, Yong Cui, and Chuanxiong Guo. 2020. A unified architecture for accelerating distributed DNN training in heterogeneous GPU/CPU clusters. In OSDI.
- Yuliang Li, Rui Miao, Hongqiang Harry Liu, Yan Zhuang, Fei Feng, Lingbo Tang, Zheng Cao, Ming Zhang, Frank Kelly, Mohammad Alizadeh, et al. 2019. HPCC: High precision congestion control. In SIGCOMM.
- Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, et al. 2022. Software-hardware co-design for fast and scalable training of deep learning recommendation models. In ISCA.
- Kai Wang, Fang Dong, Dian Shen, Chengtian Zhang, Jinghui Zhang, and Junzhou Luo. 2021. Towards tunable RDMA parameter selection at runtime for datacenter applications. In CSCWD.
- Siyu Yan, Xiaoliang Wang, Xiaolong Zheng, Yinben Xia, Derui Liu, and Weishan Deng. 2021. ACC: Automatic ECN tuning for high-speed datacenter networks. In SIGCOMM.
- Yibo Zhu, Haggai Eran, Daniel Firestone, Chuanxiong Guo, Marina Lipshteyn, Yehonatan Liron, Jitendra Padhye, Shachar Raindel, Mohamad Haj Yahia, and Ming Zhang. 2015. Congestion control for large-scale RDMA deployments. In SIGCOMM.
Index Terms
- Poster: Chameleon: Automatic and Adaptive Tuning for DCQCN Parameters in RDMA Networks