DOI: 10.1145/3337821.3337866
research-article

TLB: Traffic-aware Load Balancing with Adaptive Granularity in Data Center Networks

Published: 05 August 2019

Abstract

Modern datacenter topologies are typically multi-rooted trees that provide multiple paths between any given pair of hosts. Recent load balancing designs focus on making full use of the available parallel paths to provide high bisection bandwidth. However, they are agnostic to the mixed traffic generated by diverse data center applications and reroute every flow at the same granularity, regardless of flow type. As a result, short flows suffer from long-tailed queueing delay and packet reordering, while the throughput of long flows degrades dramatically due to low link utilization and packet reordering under the non-adaptive granularity. To solve these problems, we design a traffic-aware load balancing (TLB) scheme that adopts different rerouting granularities for the two kinds of flows. Specifically, TLB adaptively adjusts the switching granularity of long flows according to the load strength of short ones. Under a heavy load of short flows, long flows use a large switching granularity so that short flows get more opportunities to choose short queues and complete quickly. When the load strength of short flows is low, long flows switch paths more flexibly with a small switching granularity to achieve high throughput. TLB is deployed at the switch, without any modifications to the end hosts. Experimental results from NS2 simulations and a Mininet implementation show that TLB significantly reduces the average flow completion time (AFCT) of short flows by ~15%-40% compared with state-of-the-art load balancing schemes while achieving high throughput for long flows.
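
The adaptive-granularity idea in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper, which the abstract does not spell out. The size cutoff `SHORT_FLOW_BYTES`, the linear mapping from short-flow load to granularity, the packet bounds `min_pkts` and `max_pkts`, and the choice to let short flows pick the shortest queue on every packet are all assumptions made for illustration, and every name below is hypothetical.

```python
from dataclasses import dataclass

# Hypothetical cutoff: a flow counts as "long" once it has sent more than this.
SHORT_FLOW_BYTES = 100 * 1024


def long_flow_granularity(short_load, min_pkts=1, max_pkts=64):
    """Map short-flow load in [0, 1] to a rerouting granularity in packets.

    Assumed linear mapping: heavier short-flow load gives long flows a larger
    granularity, so they switch paths less often.
    """
    load = min(max(short_load, 0.0), 1.0)  # clamp to [0, 1]
    return round(min_pkts + load * (max_pkts - min_pkts))


def shortest_queue(queues):
    """Return the index of the path with the shortest queue."""
    return min(range(len(queues)), key=lambda p: queues[p])


@dataclass
class FlowState:
    bytes_seen: int = 0    # bytes forwarded so far for this flow
    path: int = 0          # path currently assigned to the flow
    pkts_on_path: int = 0  # packets sent since the last path switch


def route_packet(flow, pkt_bytes, queues, short_load):
    """Pick an output path for one packet and update the flow state."""
    flow.bytes_seen += pkt_bytes
    if flow.bytes_seen <= SHORT_FLOW_BYTES:
        # Assumption: short flows choose the shortest queue for every packet.
        flow.path = shortest_queue(queues)
        flow.pkts_on_path = 1
        return flow.path
    # Long flow: reroute only after a full granularity of packets on one path.
    if flow.pkts_on_path >= long_flow_granularity(short_load):
        flow.path = shortest_queue(queues)
        flow.pkts_on_path = 0
    flow.pkts_on_path += 1
    return flow.path
```

In this sketch, a long flow under `short_load=1.0` stays on its path for 64 packets before it may switch, which leaves the shorter queues to short flows. Under `short_load=0.0` it may switch after every packet, which approximates per-packet spraying.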

Information

Published In

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019
1107 pages
ISBN: 9781450362955
DOI: 10.1145/3337821
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data center
  2. TCP
  3. load balancing
  4. multipath

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2019

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 26
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 25 Feb 2025.

Cited By

  • (2024) Enhancing Load Balancing With In-Network Recirculation to Prevent Packet Reordering in Lossless Data Centers. IEEE/ACM Transactions on Networking 32:5 (4114-4127). DOI: 10.1109/TNET.2024.3403671. Online publication date: Oct-2024
  • (2024) Optimal Server Prediction in Data Center Networks using Machine Learning Techniques. 2024 2nd International Conference on Networking and Communications (ICNWC) (1-4). DOI: 10.1109/ICNWC60771.2024.10537236. Online publication date: 2-Apr-2024
  • (2024) Load Balancing Based on Flow Classification with Private Link in Programmable Switch. 2024 9th International Conference on Computer and Communication Systems (ICCCS) (646-651). DOI: 10.1109/ICCCS61882.2024.10602955. Online publication date: 19-Apr-2024
  • (2024) Adaptive Routing for Datacenter Networks Using Ant Colony Optimization. Algorithms and Architectures for Parallel Processing (290-309). DOI: 10.1007/978-981-97-0798-0_17. Online publication date: 1-Mar-2024
  • (2024) Deep Reinforcement Learning Based Load Balancing for Heterogeneous Traffic in Datacenter Networks. Algorithms and Architectures for Parallel Processing (270-289). DOI: 10.1007/978-981-97-0798-0_16. Online publication date: 1-Mar-2024
  • (2023) RLB: Reordering-Robust Load Balancing in Lossless Datacenter Networks. Proceedings of the 52nd International Conference on Parallel Processing (576-584). DOI: 10.1145/3605573.3605617. Online publication date: 7-Aug-2023
  • (2023) Reducing tail latency with coding-based packet spraying in edge datacenters. Journal of Systems Architecture: the EUROMICRO Journal 134:C. DOI: 10.1016/j.sysarc.2022.102783. Online publication date: 1-Jan-2023
  • (2023) Load balancing for heterogeneous traffic in datacenter networks. Journal of Network and Computer Applications 217:C. DOI: 10.1016/j.jnca.2023.103692. Online publication date: 1-Aug-2023
  • (2022) Load Balancing in PFC-Enabled Datacenter Networks. Proceedings of the 6th Asia-Pacific Workshop on Networking (21-28). DOI: 10.1145/3542637.3542641. Online publication date: 1-Jul-2022
  • (2022) CLB: Coarse-Grained Precision Traffic-Aware Weighted Cost Multipath Load Balancing on PISA. IEEE Transactions on Network and Service Management 19:2 (784-803). DOI: 10.1109/TNSM.2022.3142106. Online publication date: 11-Jan-2022
