DOI: 10.1145/2342356.2342390
research-article
Free access

DeTail: reducing the flow completion time tail in datacenter networks

Published: 13 August 2012 Publication History

Abstract

Web applications have become so sophisticated that rendering a typical page may require hundreds of intra-datacenter flows. At the same time, web sites must meet strict page creation deadlines of 200-300 ms to satisfy user demands for interactivity. Long-tailed flow completion times make it challenging for web sites to meet these constraints. Sites are forced to choose between rendering only a subset of the complex page and delaying its rendering past the deadline, sacrificing either quality or responsiveness. Either option leads to potential financial loss.
In this paper, we present a new cross-layer network stack aimed at reducing the long tail of flow completion times. The approach exploits cross-layer information to reduce packet drops, prioritize latency-sensitive flows, and evenly distribute network load. We evaluate our approach through NS-3-based simulation and a Click-based implementation, demonstrating our ability to consistently reduce the tail across a wide range of workloads. We often achieve reductions of over 50% in 99.9th percentile flow completion times.
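
To make the motivation above concrete, the following minimal sketch estimates how often a page that depends on many flows is held up by at least one tail flow. It rests on two simplifying assumptions that are not taken from the paper: flow completion times are independent, and a page waits for its slowest flow. The function name and the sample flow counts are illustrative only.

```python
def fraction_of_pages_hitting_tail(flows_per_page: int, percentile: float = 99.9) -> float:
    """Probability that at least one of a page's flows is slower than the
    given percentile of the flow completion time distribution.

    Assumes independent flows (a simplification, not a claim from the paper).
    """
    # Probability that a single flow finishes at or below the percentile.
    p_fast = percentile / 100.0
    # All flows must be "fast" for the page to avoid the tail entirely.
    return 1.0 - p_fast ** flows_per_page


# Illustrative flow counts; "hundreds" per page is the figure the abstract gives.
for n in (1, 10, 100, 200, 500):
    share = fraction_of_pages_hitting_tail(n)
    print(f"{n:4d} flows per page: {share:.1%} of pages include a 99.9th-percentile flow")
```

Under these assumptions, roughly 18% of pages that issue 200 flows would include at least one flow from the slowest 0.1%, which is why a reduction at the 99.9th percentile matters more than the percentile alone suggests.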

    Published In

    SIGCOMM '12: Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication
    August 2012
    474 pages
    ISBN:9781450314190
    DOI:10.1145/2342356
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. datacenter network
    2. flow statistics
    3. multi-path

    Qualifiers

    • Research-article

    Conference

    SIGCOMM '12: ACM SIGCOMM 2012 Conference
    August 13 - 17, 2012
    Helsinki, Finland

    Acceptance Rates

    Overall Acceptance Rate 462 of 3,389 submissions, 14%

    Article Metrics

    • Downloads (last 12 months): 198
    • Downloads (last 6 weeks): 32
    Reflects downloads up to 02 Mar 2025

    Cited By

    • (2025) Reunion: Receiver-driven network load balancing mechanism in AI training clusters. Computer Networks 259 (111088). DOI: 10.1016/j.comnet.2025.111088. Online publication date: Mar-2025
    • (2024) En4S: Enabling SLOs in Serverless Storage Systems. Proceedings of the 2024 ACM Symposium on Cloud Computing (160-177). DOI: 10.1145/3698038.3698529. Online publication date: 20-Nov-2024
    • (2024) Alibaba HPN: A Data Center Network for Large Language Model Training. Proceedings of the ACM SIGCOMM 2024 Conference (691-706). DOI: 10.1145/3651890.3672265. Online publication date: 4-Aug-2024
    • (2024) BurstBalancer: Do Less, Better Balance for Large-Scale Data Center Traffic. IEEE Transactions on Parallel and Distributed Systems 35:6 (932-949). DOI: 10.1109/TPDS.2023.3295454. Online publication date: Jun-2024
    • (2024) SAR: Receiver-Driven Transport Protocol With Micro-Burst Prediction in Data Center Networks. IEEE Transactions on Network and Service Management 21:6 (6409-6422). DOI: 10.1109/TNSM.2024.3450597. Online publication date: Dec-2024
    • (2024) Realizing the Carbon-Aware Service Provision in ICT System. IEEE Transactions on Network and Service Management 21:4 (4090-4103). DOI: 10.1109/TNSM.2024.3385484. Online publication date: Aug-2024
    • (2024) GTCC: A Game Theoretic Approach for Efficient Congestion Control in Datacenter Networks. IEEE Transactions on Network Science and Engineering 11:6 (6328-6344). DOI: 10.1109/TNSE.2024.3443099. Online publication date: Nov-2024
    • (2024) Flow Optimization at Inter-Datacenter Networks for Application Run-time Acceleration. ICC 2024 - IEEE International Conference on Communications (2670-2675). DOI: 10.1109/ICC51166.2024.10623094. Online publication date: 9-Jun-2024
    • (2024) Elephant Flow Detection With Random Forest Models Under Programmable Network Dataplane Constraints. IEEE Access 12 (158561-158578). DOI: 10.1109/ACCESS.2024.3485588. Online publication date: 2024
    • (2024) Machine Learning-Based Elephant Flow Classification on the First Packet. IEEE Access 12 (105744-105760). DOI: 10.1109/ACCESS.2024.3436056. Online publication date: 2024
