research-article

Trevi: watering down storage hotspots with cool fountain codes

Authors: George Parisis, Toby Moncaster, Anil Madhavapeddy, Jon Crowcroft

HotNets-XII: Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks

Article No.: 22, Pages 1 - 7

https://doi.org/10.1145/2535771.2535781

Published: 21 November 2013

Abstract

Datacenter networking has brought research on high-performance storage systems to the foreground once again. Many modern storage systems are built with commodity hardware and TCP/IP networking to save costs. In this paper, we highlight a group of problems that are present in such storage systems, all of which are related to the use of TCP. As an alternative, we explore Trevi: a fountain coding-based approach for distributing I/O requests that overcomes these problems while still efficiently scheduling resources across both the networking and storage layers. We also discuss how receiver-driven flow and congestion control, in combination with fountain coding, can guide the design of Trevi and provide a viable alternative to TCP for datacenter storage.
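To make the fountain-coding idea concrete, the following is a minimal sketch of a rateless XOR code with a peeling decoder. It is not Trevi's design or implementation: the function names (`encode_symbol`, `decode`, `transfer`), the toy degree distribution, and the loss pattern are all assumptions made for illustration, and a real system would use a properly designed code such as LT or Raptor. The point it illustrates is that the receiver keeps collecting encoded symbols until it can decode, so no specific lost symbol ever has to be retransmitted.

```python
import random


def _xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _neighbours(k, seed):
    """Pseudo-random set of source-block indices for one encoded symbol.

    The degree distribution is a toy assumption, not a robust soliton
    distribution as used by real LT codes.
    """
    rng = random.Random(seed)
    degree = min(rng.choice([1, 1, 2, 2, 2, 3, 4]), k)
    return set(rng.sample(range(k), degree))


def encode_symbol(blocks, seed):
    """Return (seed, payload), where payload is the XOR of the chosen blocks."""
    payload = bytes(len(blocks[0]))
    for i in _neighbours(len(blocks), seed):
        payload = _xor(payload, blocks[i])
    return seed, payload


def decode(k, symbols):
    """Peeling decoder: return the k source blocks, or None if undecodable."""
    recovered = [None] * k
    pending = [[_neighbours(k, seed), payload] for seed, payload in symbols]
    progress = True
    while progress:
        progress = False
        for entry in pending:
            idx, payload = entry
            # Subtract (XOR out) every block that is already known.
            for i in [j for j in idx if recovered[j] is not None]:
                payload = _xor(payload, recovered[i])
                idx.remove(i)
            entry[1] = payload
            # A symbol covering exactly one unknown block reveals that block.
            if len(idx) == 1:
                (i,) = idx
                recovered[i] = payload
                idx.clear()
                progress = True
    return recovered if all(b is not None for b in recovered) else None


def transfer(data, block_size=8, max_symbols=10_000):
    """Send symbols over a lossy path until the receiver can decode.

    Assumes len(data) is a multiple of block_size. Every third symbol is
    dropped to imitate loss; the receiver never asks for a specific symbol.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    received = []
    for seed in range(max_symbols):
        symbol = encode_symbol(blocks, seed)
        if seed % 3 == 0:  # simulated loss
            continue
        received.append(symbol)
        result = decode(len(blocks), received)
        if result is not None:
            return b"".join(result), len(received)
    raise RuntimeError("decoding did not complete within max_symbols")


if __name__ == "__main__":
    original = bytes(range(64))
    recovered, used = transfer(original)
    print(recovered == original, "symbols used:", used)
```

In this sketch, losing a symbol only delays decoding until further symbols arrive, which is the property that makes receiver-driven flow and congestion control a natural fit for fountain-coded transfers.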


Index Terms

Trevi: watering down storage hotspots with cool fountain codes
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Networks
  1. Network architectures
  2. Network protocols

Published In

HotNets-XII: Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks

November 2013

188 pages

ISBN:9781450325967

DOI:10.1145/2535771

General Chair: Dave Levine (UMD)

Program Chairs: Sachin Katti (Stanford University), Dave Oran (Cisco)

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Seventh Framework Programme

Conference

HotNets-XII

Sponsor:

SIGCOMM

HotNets-XII: Twelfth ACM Workshop on Hot Topics in Networks

November 21 - 22, 2013

College Park, Maryland

Acceptance Rates

HotNets-XII Paper Acceptance Rate 26 of 110 submissions, 24%;

Overall Acceptance Rate 110 of 460 submissions, 24%
