DOI: 10.1145/2694344.2694362
research-article

DIABLO: A Warehouse-Scale Computer Network Simulator using FPGAs

Published: 14 March 2015

ABSTRACT

Motivated by rapid software and hardware innovation in warehouse-scale computing (WSC), we visit the problem of warehouse-scale network design evaluation. A WSC is composed of about 30 arrays or clusters, each of which contains about 3,000 servers, leading to a total of about 100,000 servers per WSC. We found that many prior experiments were conducted on relatively small physical testbeds, and that they often assume the workload is static and that computations are only loosely coupled with the adaptive networking stack. We present a novel and cost-efficient FPGA-based evaluation methodology, called Datacenter-In-A-Box at LOw cost (DIABLO), which treats arrays as whole computers with tightly integrated hardware and software. We have built a 3,000-node prototype running the full WSC software stack. Using our prototype, we have successfully reproduced a few WSC phenomena, such as TCP Incast and the long tail of memcached request latency, and found that results do indeed change with both scale and the version of the full software stack.


Published in

ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2015, 720 pages
ISBN: 9781450328357
DOI: 10.1145/2694344

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ASPLOS '15 Paper Acceptance Rate: 48 of 287 submissions, 17%
Overall Acceptance Rate: 535 of 2,713 submissions, 20%
