DOI: 10.1145/2535771.2535781 (research article)

Trevi: watering down storage hotspots with cool fountain codes

Published: 21 November 2013

ABSTRACT

Datacenter networking has brought research on high-performance storage systems to the foreground once again. Many modern storage systems are built from commodity hardware and TCP/IP networking to save costs. In this paper, we highlight a group of problems present in such storage systems, all of which stem from the use of TCP. As an alternative, we explore Trevi: a fountain-coding-based approach to distributing I/O requests that overcomes these problems while still efficiently scheduling resources across both the networking and storage layers. We also discuss how receiver-driven flow and congestion control, in combination with fountain coding, can guide the design of Trevi and provide a viable alternative to TCP for datacenter storage.
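To make the fountain-coding idea concrete, the sketch below shows a toy LT-style encoder over a storage block: the block is split into fixed-size source fragments, and each coded symbol is the XOR of a randomly chosen fragment subset, so a receiver can decode once it has collected slightly more symbols than fragments, regardless of which particular symbols were lost or reordered by the network. This is only an illustrative sketch, not Trevi's implementation: the function name lt_encode, the uniform degree choice, and the symbol size are assumptions made here for brevity (real LT/Raptor codes use carefully designed degree distributions).

```python
import random

def lt_encode(block: bytes, symbol_size: int, seed: int):
    """Produce one fountain-coded symbol from a storage block by XOR-ing a
    randomly chosen subset of fixed-size source fragments."""
    # Split the block into equal-size fragments, padding the last one.
    fragments = [block[i:i + symbol_size].ljust(symbol_size, b"\x00")
                 for i in range(0, len(block), symbol_size)]
    rng = random.Random(seed)
    # Toy degree choice: a real LT/Raptor code samples the degree from a
    # robust soliton distribution; uniform is enough for illustration.
    degree = rng.randint(1, len(fragments))
    chosen = rng.sample(range(len(fragments)), degree)
    symbol = bytearray(symbol_size)
    for idx in chosen:
        for j, byte in enumerate(fragments[idx]):
            symbol[j] ^= byte
    # The seed (or the fragment index list) travels with the symbol so the
    # receiver can regenerate the same fragment subset when decoding.
    return bytes(symbol), chosen

# Example: a writer keeps emitting symbols with fresh seeds until the
# receiver signals it has gathered enough to reconstruct the block.
block = b"example 4 KB storage block contents " * 100
symbol, fragment_indices = lt_encode(block, symbol_size=256, seed=1)
```

Because any sufficiently large set of symbols suffices for decoding, the sender never needs to retransmit a specific lost packet, which is what lets the receiver (rather than TCP at the sender) drive flow and congestion control.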


Published in

HotNets-XII: Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks
November 2013, 188 pages
ISBN: 9781450325967
DOI: 10.1145/2535771
Copyright © 2013 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States



          Acceptance Rates

HotNets-XII paper acceptance rate: 26 of 110 submissions (24%). Overall acceptance rate: 110 of 460 submissions (24%).
