DOI: 10.1145/2535771.2535781

Trevi: watering down storage hotspots with cool fountain codes

Published: 21 November 2013

Abstract

Datacenter networking has brought research on high-performance storage systems to the foreground once again. Many modern storage systems are built with commodity hardware and TCP/IP networking to save costs. In this paper, we highlight a group of problems in such storage systems that are all related to the use of TCP. As an alternative, we explore Trevi: a fountain coding-based approach for distributing I/O requests that overcomes these problems while still efficiently scheduling resources across both the networking and storage layers. We also discuss how receiver-driven flow and congestion control, in combination with fountain coding, can guide the design of Trevi and provide a viable alternative to TCP for datacenter storage.
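
To make the fountain-coding idea in the abstract concrete, the following is a minimal sketch of a toy LT-style code with a receiver that keeps pulling symbols until it can decode. It is not Trevi's design: the function names (encode_symbol, peel_decode, fetch), the degree distribution, the integer block representation, and the "every third symbol is lost" rule are all assumptions made for illustration.

```python
import random


def encode_symbol(blocks, seed):
    """Return one encoded symbol: the XOR of a pseudo-random subset of blocks.

    The seed stands in for a symbol ID that sender and receiver both know.
    The degree distribution is a toy choice (assumption), not robust soliton.
    """
    rng = random.Random(seed)
    k = len(blocks)
    degree = min(k, rng.choice([1, 1, 2, 2, 3]))
    indices = rng.sample(range(k), degree)
    value = 0
    for i in indices:
        value ^= blocks[i]
    return frozenset(indices), value


def peel_decode(k, symbols):
    """Peeling decoder: return the k source blocks, or None if not decodable yet."""
    recovered = {}
    work = [[set(indices), value] for indices, value in symbols]
    progress = True
    while progress:
        progress = False
        for sym in work:
            indices = sym[0]
            # Subtract (XOR out) every block that is already known.
            for i in [i for i in indices if i in recovered]:
                indices.discard(i)
                sym[1] ^= recovered[i]
            # A symbol covering exactly one unknown block reveals that block.
            if len(indices) == 1:
                recovered[indices.pop()] = sym[1]
                progress = True
    if len(recovered) == k:
        return [recovered[i] for i in range(k)]
    return None


def fetch(blocks, max_symbols=10_000):
    """Receiver loop: pull symbols until decoding succeeds.

    Every third symbol is dropped to mimic loss (hypothetical loss model).
    No retransmission of a specific symbol is ever requested.
    """
    k = len(blocks)
    received = []
    for seed in range(max_symbols):
        if seed % 3 == 0:
            continue  # simulated loss: this symbol never arrives
        received.append(encode_symbol(blocks, seed))
        decoded = peel_decode(k, received)
        if decoded is not None:
            return decoded, len(received)
    raise RuntimeError("could not decode within max_symbols")
```

Calling fetch([0x11, 0x22, 0x33, 0x44]) returns the original four blocks together with the number of symbols that had to arrive. The point of the sketch is that any sufficiently large set of symbols works, so a lost symbol is replaced by simply receiving a later one rather than by retransmitting the lost one.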

        Published In

        HotNets-XII: Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks
        November 2013
        188 pages
        ISBN:9781450325967
        DOI:10.1145/2535771

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. TCP incast
        2. datacenter storage
        3. fountain coding

        Qualifiers

        • Research-article

        Conference

        HotNets-XII
        HotNets-XII: Twelfth ACM Workshop on Hot Topics in Networks
        November 21 - 22, 2013
        College Park, Maryland

        Acceptance Rates

        HotNets-XII Paper Acceptance Rate 26 of 110 submissions, 24%;
        Overall Acceptance Rate 110 of 460 submissions, 24%

        Cited By

        • (2021) SCDP: Systematic Rateless Coding for Efficient Data Transport in Data Centers. IEEE/ACM Transactions on Networking 29:6 (2723-2736). DOI: 10.1109/TNET.2021.3098386. Online publication date: Dec-2021
        • (2019) Reducing tail latency using duplication. Proceedings of the 15th International Conference on Emerging Networking Experiments And Technologies (246-259). DOI: 10.1145/3359989.3365432. Online publication date: 3-Dec-2019
        • (2018) Polyraptor. Proceedings of the ACM SIGCOMM 2018 Conference on Posters and Demos (69-71). DOI: 10.1145/3234200.3234222. Online publication date: 7-Aug-2018
        • (2016) Towards a Redundancy-Aware Network Stack for Data Centers. Proceedings of the 15th ACM Workshop on Hot Topics in Networks (57-63). DOI: 10.1145/3005745.3005764. Online publication date: 9-Nov-2016
        • (2016) Enhancing multi-source content delivery in content-centric networks with fountain coding. Proceedings of the 1st Workshop on Content Caching and Delivery in Wireless Networks (1-7). DOI: 10.1145/2836183.2836187. Online publication date: 1-Dec-2016
        • (2014) Filling the gaps of unused capacity through a fountain coded dissemination of information. ACM SIGMOBILE Mobile Computing and Communications Review 18:1 (46-54). DOI: 10.1145/2581555.2581563. Online publication date: 12-Feb-2014
