Elsevier

Computer Communications

Volume 82, 15 May 2016, Pages 1-12
Towards cost-effective and low latency data center network architecture

https://doi.org/10.1016/j.comcom.2016.02.016

Abstract

This paper presents the design, analysis, and implementation of a novel data center network architecture named NovaCube. Based on the regular Torus topology, NovaCube is constructed by adding a number of the most beneficial jump-over links, which offers many distinct advantages and practical benefits. Moreover, to enable NovaCube to achieve its maximum theoretical performance, a probabilistic oblivious routing algorithm, PORA, is carefully designed. PORA is both deadlock-free and livelock-free, and achieves near-optimal average routing path length with better load balancing, thus leading to higher throughput. Theoretical derivation and mathematical analysis, together with extensive simulations, further demonstrate the good performance of NovaCube and PORA.

Introduction

The data center network (DCN) architecture is regarded as one of the most important determinants of network performance in data centers, and plays a significant role in meeting the requirements of cloud-based services as well as the agility and dynamic reconfigurability of the infrastructure for changing application demands. As a result, many novel architectures, such as Fat-Tree [1], VL2 [2], DCell [3], BCube [4], c-Through [5], Helios [6], SprintNet [7], [8], CamCube [9], Small-World [10], NovaCube [11], CLOT [12], and so on, have been proposed to efficiently interconnect the servers inside a data center and deliver peak performance to users.

Generally, DCN topologies can be classified into four categories: multi-rooted tree-based topology (e.g. Fat-Tree), server-centric topology (e.g. DCell, BCube, SprintNet), hybrid network (e.g. c-Through, Helios) and direct network (e.g. CamCube, Small-World) [13]. Each has its own advantages and disadvantages. Tree-based topologies, like Fat-Tree and Clos, can provide full bisection bandwidth, so their any-to-any performance is good. However, their building cost and complexity are relatively high. The recursively defined server-centric topology usually concentrates on scalability and incremental extensibility with a lower building cost; however, full bisection bandwidth may not be achieved and its performance guarantee is limited to a small scope. The hybrid network is a hybrid packet-switched and circuit-switched network architecture. Compared with packet switching, optical circuit switching can provide higher bandwidth and lower transmission latency with lower energy consumption. However, optical circuit switching cannot achieve full bisection bandwidth at packet granularity. Furthermore, the optics also suffer from slow switching speed, which can take as long as tens of milliseconds.

The direct network topology, which connects servers directly to other servers, is a switchless interconnection that uses no switches or routers. It is usually constructed in a regular pattern, such as a Torus (as shown in Fig. 1). Besides being widely used in high-performance computing systems, Torus is also an attractive network architecture candidate for data centers. However, this design consistently suffers from poor routing efficiency compared with other designs because of its relatively long network diameter (the longest shortest path between any node pair), which is known to be (k/2)·n for an n-D Torus with radix k. A long network diameter may also lead to high communication delay. Furthermore, its performance depends largely on the routing algorithm.
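To make the diameter claim concrete, the following is a minimal sketch that computes hop distances in a plain n-D Torus of radix k and finds the diameter by brute force. The function names are illustrative and not taken from the paper, and the sketch covers only the regular Torus, not NovaCube with its jump-over links.

```python
from itertools import product


def ring_distance(a, b, k):
    """Hop distance between positions a and b on a ring of k nodes."""
    d = abs(a - b) % k
    return min(d, k - d)  # a ring can be traversed in either direction


def torus_distance(u, v, k):
    """Shortest-path hop count between nodes u and v of a k-ary n-cube."""
    return sum(ring_distance(a, b, k) for a, b in zip(u, v))


def torus_diameter(k, n):
    """Brute-force diameter of a k-ary n-cube.

    The Torus is symmetric, so measuring from the origin is enough.
    """
    origin = (0,) * n
    return max(torus_distance(origin, v, k)
               for v in product(range(k), repeat=n))


if __name__ == "__main__":
    for k, n in [(4, 2), (4, 3), (6, 2), (8, 3)]:
        print(f"k={k}, n={n}: diameter={torus_diameter(k, n)}, (k/2)*n={k // 2 * n}")
```

For even k the brute-force result matches (k/2)·n in each case listed, which illustrates why the diameter grows with both the radix and the number of dimensions.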

To address these shortcomings, in this paper we propose a novel container-level, high-performance Torus-based DCN architecture named NovaCube. The key design principle of NovaCube is to connect the farthest node pairs by adding jump-over links. In this way, NovaCube can halve the network diameter and achieve higher bisection bandwidth and throughput. Moreover, we design a new weighted probabilistic oblivious deadlock-free routing algorithm, PORA, for NovaCube, which achieves low average routing path length and good load balancing by exploiting path diversity.

The primary contributions of this paper can be summarized as follows:

  • (1) We propose a novel Torus-based DCN architecture, NovaCube, which exhibits good performance in network latency, bisection bandwidth, throughput, path diversity and average path length.

  • (2) We carefully design a weighted probabilistic oblivious routing algorithm, PORA, which is both deadlock-free and livelock-free, and helps NovaCube achieve good load balancing.

  • (3) We introduce a credit-based lossless flow control mechanism in the NovaCube network.

  • (4) We design a practical geographical address assignment mechanism, which can also be applied to the traditional Torus network.

  • (5) We implement the NovaCube architecture, the PORA routing algorithm and the flow control mechanism in NS-3. Extensive simulations are conducted to demonstrate the good performance of NovaCube.

The rest of the paper is organized as follows. First, we briefly review the related research literature in Section 2. Section 3 then presents the motivation. In Section 4, the NovaCube architecture is introduced and analyzed in detail. The routing algorithm PORA is then designed in Section 5. Section 6 introduces the credit-based flow control mechanism. Section 7 describes a geographical address assignment mechanism. Section 8 presents the system evaluation and simulation results. Finally, Section 9 concludes the paper.

Section snippets

Network interconnection

The Torus-based topology implements network locality well by placing servers in close proximity to each other, which increases communication efficiency. Besides being widely used in supercomputing, the Torus network has also been introduced to data center networks. Three typical representatives are CamCube [9], Small-World [10] and CLOT [12].

CamCube was proposed by Abu-Libdeh et al., and the servers in CamCube are interconnected in a 3D Torus topology. The CamCube is designed

Why Torus-based clusters

The Torus (or, more precisely, k-ary n-cube) based interconnection has been regarded as an attractive DCN architecture because of its unique advantages, some of which are listed below.

Firstly, it incurs lower infrastructure cost since it is a switchless architecture that needs no expensive switches. In addition, the power consumed by the switches and their associated cooling can also be saved.

Secondly, it achieves better fault-tolerance. Traditional architecture

NovaCube network design

This section presents the network design and theoretical analysis of NovaCube. Before introducing the physical interconnection structure, we first provide a theorem with proof, which offers a theoretical basis for the NovaCube design.

Theorem 4.1

For any node A(a1, a2, … , an) in a k-ary n-cube (when k is even) if B(b1, b2, … , bn) is assumed to be the farthest node from A, then B is unique and B’s unique farthest node is exactly A.

Proof

In a k-ary n-cube, if B(b1, b2, … , bn) is the farthest node from A(a1, a2
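As an informal check of Theorem 4.1 (not a substitute for the proof), the sketch below enumerates the farthest nodes in a small k-ary n-cube by brute force. The helper names, including `antipode`, are hypothetical; the closed form b_i = (a_i + k/2) mod k used in `antipode` is a standard property of even-radix rings and is an assumption here rather than something quoted from the paper.

```python
from itertools import product


def torus_distance(u, v, k):
    """Shortest-path hop count between nodes u and v of a k-ary n-cube."""
    total = 0
    for a, b in zip(u, v):
        d = abs(a - b) % k
        total += min(d, k - d)
    return total


def farthest_nodes(a, k):
    """Return every node at maximum distance from a (brute force)."""
    best, result = -1, []
    for v in product(range(k), repeat=len(a)):
        d = torus_distance(a, v, k)
        if d > best:
            best, result = d, [v]
        elif d == best:
            result.append(v)
    return result


def antipode(a, k):
    """Assumed closed form for even k: shift every coordinate by k/2."""
    return tuple((x + k // 2) % k for x in a)


if __name__ == "__main__":
    a, k = (1, 3, 0), 4
    b = antipode(a, k)
    print(farthest_nodes(a, k) == [b])   # unique farthest node of A
    print(farthest_nodes(b, k) == [a])   # whose unique farthest node is A
    print(len(farthest_nodes((0, 0), 5)))  # odd k: more than one farthest node
```

The odd-radix case at the end shows why the theorem requires k to be even: with k = 5 each dimension has two equally distant positions, so the farthest node is no longer unique.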

Routing scheme

This section presents the specially designed routing algorithm, named PORA, for NovaCube, which aims to help NovaCube achieve its maximum theoretical performance. PORA is a probabilistic weighted oblivious routing algorithm that is both livelock-free and deadlock-free.

Credit-based flow control mechanism

Flow control, also known as congestion control, is designed to manage the rate of data transmission between devices or nodes in a network so that network buffers are not overwhelmed. When more data arrives than a device can handle, its buffers overflow, and the data is either lost or must be retransmitted. Thus, the main objective of flow control is to limit packet delay and avoid buffer overflow. In traditional Internet, the protocols with flow control functionality like TCP
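The general idea of credit-based lossless flow control can be sketched as follows. This is a generic textbook scheme under simplifying assumptions, not the mechanism specified in the paper: the class `CreditLink` and its methods are hypothetical, one credit stands for one free buffer slot at the downstream node, and credits are returned instantly rather than after a link delay.

```python
from collections import deque


class CreditLink:
    """One direction of a link whose downstream buffer holds `slots` packets."""

    def __init__(self, slots):
        self.credits = slots    # sender's view of free downstream slots
        self.buffer = deque()   # downstream input buffer

    def try_send(self, packet):
        """Send only if a credit is available; otherwise hold the packet."""
        if self.credits == 0:
            return False        # no free slot downstream, so nothing is dropped
        self.credits -= 1
        self.buffer.append(packet)
        return True

    def drain_one(self):
        """Downstream node forwards one packet and returns a credit."""
        if not self.buffer:
            return None
        packet = self.buffer.popleft()
        self.credits += 1       # freed slot is advertised back to the sender
        return packet


if __name__ == "__main__":
    link = CreditLink(slots=2)
    for name in ["p0", "p1", "p2"]:
        print(name, "sent" if link.try_send(name) else "held at sender")
    link.drain_one()
    print("p2", "sent" if link.try_send("p2") else "held at sender")
```

Because the sender never transmits without a credit, the downstream buffer cannot overflow; the cost is that packets wait at the sender while the receiver is full.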

Network layering

Similar to the traditional Internet, the protocol stack of NovaCube is divided into five abstraction layers: application layer, transport layer, network layer, link layer and physical layer. The only difference lies in the network layer, where the traditional Internet uses IP addresses to locate hosts while NovaCube uses coordinates to direct data transmission, exploiting the symmetry of the topology, which can greatly improve routing efficiency. Except for network

Evaluation

In this section, we evaluate the performance of NovaCube and the PORA routing algorithm under various network conditions using Network Simulator 3 (NS-3) [46]. The link bandwidth is 1 Gbps and each link is capable of bidirectional communication. The default maximum transmission unit (MTU) of a link is 1500 bytes. The propagation delay of a link and the processing time for a packet at a node are set to 4 μs and 1.5 μs, respectively. Besides, the Weibull distribution is adopted to determine the
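To give a feel for what these settings imply, the sketch below estimates the unloaded latency of a full-size packet over a path of a given hop count. It assumes store-and-forward forwarding, no queueing, and one processing delay per hop; these assumptions and the function names are illustrative and are not stated in the paper.

```python
LINK_BANDWIDTH_BPS = 1_000_000_000  # 1 Gbps, from the evaluation setup
MTU_BYTES = 1500                    # default MTU, from the evaluation setup
PROPAGATION_DELAY_US = 4.0          # per link
PROCESSING_DELAY_US = 1.5           # per packet at a node


def transmission_delay_us(size_bytes, bandwidth_bps=LINK_BANDWIDTH_BPS):
    """Time to serialize a packet onto a link, in microseconds."""
    return size_bytes * 8 * 1e6 / bandwidth_bps


def unloaded_path_latency_us(hops, size_bytes=MTU_BYTES):
    """Latency over `hops` links, assuming store-and-forward and no queueing."""
    per_hop = (transmission_delay_us(size_bytes)
               + PROPAGATION_DELAY_US
               + PROCESSING_DELAY_US)
    return hops * per_hop


if __name__ == "__main__":
    for hops in (3, 6):
        print(f"{hops} hops: {unloaded_path_latency_us(hops):.1f} us")
```

Under these assumptions a 1500-byte packet costs 12 μs + 4 μs + 1.5 μs = 17.5 μs per hop, so serialization dominates and the estimated latency scales directly with hop count.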

Conclusion

In this paper, we proposed a novel data center architecture named NovaCube, and presented its design and key properties. As a switchless architecture, NovaCube’s cost-effectiveness is highlighted with regard to its energy consumption and infrastructure cost. As proved, NovaCube is also superior to other candidate architectures in terms of network diameter, throughput, average path length, bisection bandwidth, path diversity and fault tolerance. Furthermore, the specially designed probabilistic

Acknowledgment

This research has been supported by a grant from Huawei Technologies Co., Ltd. The authors would also like to express their thanks and gratitude to the anonymous reviewers whose constructive comments helped improve the manuscript.

References (46)

  • T. Wang et al. Designing efficient high performance server-centric data center network architecture. Comput. Netw. (2015)
  • Z. Guo et al. Jet: electricity cost-aware dynamic workload management in geographically distributed datacenters. Comput. Commun. (2014)
  • T. Wang et al. Towards bandwidth guaranteed energy efficient data center networking. J. Cloud Comput. (2015)
  • M. Al-Fares et al. A scalable, commodity data center network architecture. Proceedings of the ACM SIGCOMM Computer Communication Review (2008)
  • A. Greenberg et al. VL2: a scalable and flexible data center network. SIGCOMM Comput. Commun. Rev. (2009)
  • C. Guo et al. DCell: a scalable and fault-tolerant network structure for data centers. ACM SIGCOMM Comput. Commun. Rev. (2008)
  • C. Guo et al. BCube: a high performance, server-centric network architecture for modular data centers. ACM SIGCOMM Comput. Commun. Rev. (2009)
  • G. Wang et al. c-Through: part-time optics in data centers. Proceedings of the ACM SIGCOMM Computer Communication Review (2010)
  • N. Farrington et al. Helios: a hybrid electrical/optical switch architecture for modular data centers. Proceedings of the ACM SIGCOMM Computer Communication Review (2010)
  • T. Wang et al. SprintNet: a high performance server-centric network architecture for data centers. Proceedings of the 2014 IEEE International Conference on Communications (ICC) (2014)
  • H. Abu-Libdeh et al. Symbiotic routing in future data centers. ACM SIGCOMM Comput. Commun. Rev. (2010)
  • J.-Y. Shin et al. Small-world datacenters. Proceedings of the 2nd ACM Symposium on Cloud Computing (2011)
  • T. Wang et al. NovaCube: a low latency torus-based network architecture for data centers. Proceedings of the 2014 IEEE Global Communications Conference (GLOBECOM) (2014)
  • T. Wang et al. CLOT: a cost-effective low-latency overlaid torus-based network architecture for data centers. Proceedings of the 2015 IEEE International Conference on Communications (ICC) (2015)
  • T. Wang et al. Rethinking the data center networking: architecture, network protocols, and resource sharing. IEEE Access (2014)
  • Z. Guo et al. Cutting the electricity cost of distributed datacenters through smart workload dispatching. IEEE Commun. Lett. (2013)
  • L. Rao et al. Minimizing electricity cost: optimization of distributed internet data centers in a multi-electricity-market environment. Proceedings of the INFOCOM, 2010 (2010)
  • T. Wang et al. A general framework for performance guaranteed green data center networking. Proceedings of the 2014 IEEE Global Communications Conference (GLOBECOM) (2014)
  • B. Heller et al. ElasticTree: saving energy in data center networks. Proceedings of the NSDI (2010)
  • V. Mann et al. VMFlow: leveraging VM mobility to reduce network power costs in data centers. Proceedings of the NETWORKING 2011 (2011)
  • D. Meisner et al. PowerNap: eliminating server idle power. Proceedings of the ACM Sigplan Notices (2009)
  • J. Leverich et al. Power management of datacenter workloads using per-core power gating. Comput. Archit. Lett. (2009)
  • H. David et al. Memory power management via dynamic voltage/frequency scaling. Proceedings of the 8th ACM international conference on Autonomic computing (2011)