Theia: Simple and Cheap Networking for Ultra-Dense Data Centers

Published: 27 October 2014

Abstract

Recent trends toward packing data centers with more CPUs per rack have led to a scenario in which each individual rack may contain hundreds, or even thousands, of compute nodes using system-on-chip (SoC) architectures. At this increased scale, traditional rack-level star topologies, with a top-of-rack (ToR) switch as the hub and servers as the leaves, are no longer feasible in terms of monetary cost, physical space, and oversubscription. We propose Theia, an architecture that connects hundreds of SoC nodes within a rack, using inexpensive, low-latency hardware elements to group the rack's servers into subsets which we term SubRacks. We then replace the traditional per-rack ToR with a low-latency, passive, circuit-style patch panel that interconnects these SubRacks. We explore alternatives for the rack-level topology implemented by this patch panel, and we consider approaches for interconnecting racks within a data center. Finally, we investigate options for routing over these new topologies. Theia is unique in that it offers the flexibility of packet-switched networking over a fixed circuit topology.
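The abstract does not say which patch-panel topology Theia uses, so the following is only a minimal sketch of the general idea: SubRacks joined by fixed, passive links, with packets forwarded hop by hop over that wiring. The ring-with-chords (circulant) layout, the offsets, the SubRack size, and all function names are assumptions chosen for illustration, not details taken from the paper.

```python
from collections import deque


def circulant_patch_panel(num_subracks, offsets):
    """Build a fixed wiring (assumed layout): SubRack i is patched to
    SubRacks (i + k) and (i - k) modulo num_subracks for each offset k."""
    links = {i: set() for i in range(num_subracks)}
    for i in range(num_subracks):
        for k in offsets:
            j = (i + k) % num_subracks
            if j != i:
                links[i].add(j)
                links[j].add(i)
    return links


def hop_counts(links, src):
    """Breadth-first search over the fixed links: number of SubRack-to-SubRack
    hops a packet needs from src to every other SubRack."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in sorted(links[u]):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def subrack_of(server_id, servers_per_subrack):
    """Map a server index to its SubRack (hypothetical contiguous numbering)."""
    return server_id // servers_per_subrack


# Hypothetical rack: 16 SubRacks, each wired to neighbours at offsets 1 and 4.
links = circulant_patch_panel(num_subracks=16, offsets=(1, 4))
dist = hop_counts(links, src=0)
diameter = max(dist.values())
print("links per SubRack:", len(links[0]))
print("worst-case hops from SubRack 0:", diameter)
```

In this toy layout every SubRack needs only four panel links, and the price is that some traffic crosses intermediate SubRacks. This is the kind of trade-off between wiring cost and path length that a choice of rack-level topology has to weigh.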

      Published In

      HotNets-XIII: Proceedings of the 13th ACM Workshop on Hot Topics in Networks
      October 2014
      189 pages
      ISBN:9781450332569
      DOI:10.1145/2670518
      Sponsors

      In-Cooperation

      • CISCO

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Data center networks
      2. Network topologies

      Qualifiers

      • Tutorial
      • Research
      • Refereed limited

      Conference

      HotNets-XIII
      HotNets-XIII: The 13th ACM Workshop on Hot Topics in Networks
      October 27 - 28, 2014
      Los Angeles, CA, USA

      Acceptance Rates

      HotNets-XIII Paper Acceptance Rate 26 of 118 submissions, 22%;
      Overall Acceptance Rate 110 of 460 submissions, 24%

      Cited By

      • (2024) Shale: A Practical, Scalable Oblivious Reconfigurable Network. Proceedings of the ACM SIGCOMM 2024 Conference, pages 449-464. DOI: 10.1145/3651890.3672248. Online publication date: 4-Aug-2024
      • (2023) Poster: Scalability and Congestion Control in Oblivious Reconfigurable Networks. Proceedings of the ACM SIGCOMM 2023 Conference, pages 1138-1140. DOI: 10.1145/3603269.3610862. Online publication date: 10-Sep-2023
      • (2023) Networking in next generation disaggregated datacenters. Concurrency and Computation: Practice and Experience, 35:21. DOI: 10.1002/cpe.7702. Online publication date: 27-Mar-2023
      • (2017) Lightweight Data Compression for Mobile Flash Storage. ACM Transactions on Embedded Computing Systems, 16:5s, pages 1-18. DOI: 10.1145/3126511. Online publication date: 27-Sep-2017
      • (2017) Application-Aware Swapping for Mobile Systems. ACM Transactions on Embedded Computing Systems, 16:5s, pages 1-19. DOI: 10.1145/3126509. Online publication date: 27-Sep-2017
      • (2017) Adaptive Power Management in Solar Energy Harvesting Sensor Node Using Reinforcement Learning. ACM Transactions on Embedded Computing Systems, 16:5s, pages 1-21. DOI: 10.1145/3126495. Online publication date: 27-Sep-2017
      • (2016) Network requirements for resource disaggregation. Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation, pages 249-264. DOI: 10.5555/3026877.3026897. Online publication date: 2-Nov-2016
      • (2015) On optimizing machine learning workloads via kernel fusion. ACM SIGPLAN Notices, 50:8, pages 173-182. DOI: 10.1145/2858788.2688521. Online publication date: 24-Jan-2015
      • (2015) NUMA-aware graph-structured analytics. ACM SIGPLAN Notices, 50:8, pages 183-193. DOI: 10.1145/2858788.2688507. Online publication date: 24-Jan-2015
      • (2015) VirtCL: a framework for OpenCL device abstraction and management. ACM SIGPLAN Notices, 50:8, pages 161-172. DOI: 10.1145/2858788.2688505. Online publication date: 24-Jan-2015
