Towards cost-effective and low latency data center network architecture
Introduction
The data center network (DCN) architecture is regarded as one of the most important determinants of network performance in data centers, and plays a significant role in meeting the requirements of cloud-based services as well as the agility and dynamic reconfigurability of the infrastructure for changing application demands. As a result, many novel architectures, such as Fat-Tree [1], VL2 [2], DCell [3], BCube [4], c-Through [5], Helios [6], SprintNet [7], [8], CamCube [9], Small-World [10], NovaCube [11], CLOT [12], and so on, have been proposed to efficiently interconnect the servers inside a data center and deliver peak performance to users.
Generally, DCN topologies can be classified into four categories: multi-rooted tree-based topology (e.g. Fat-Tree), server-centric topology (e.g. DCell, BCube, SprintNet), hybrid network (e.g. c-Through, Helios) and direct network (e.g. CamCube, Small-World) [13]. Each has its own advantages and disadvantages. Tree-based topologies, like Fat-Tree and Clos, can provide full bisection bandwidth, so their any-to-any performance is good. However, their building cost and complexity are relatively high. The recursively defined server-centric topology usually concentrates on scalability and incremental extensibility with a lower building cost; however, full bisection bandwidth may not be achieved and its performance guarantee is limited to a small scope. The hybrid network is a hybrid packet-switched and circuit-switched network architecture. Compared with packet switching, optical circuit switching can provide higher bandwidth and lower transmission latency with lower energy consumption. However, optical circuit switching cannot achieve full bisection bandwidth at packet granularity. Furthermore, optics also suffers from slow switching, which can take as long as tens of milliseconds.
The direct network topology, which connects servers directly to other servers, is a switchless interconnection that uses no switches or routers. It is usually constructed in a regular pattern, such as a Torus (as shown in Fig. 1). Besides being widely used in high-performance computing systems, the Torus is also an attractive network architecture candidate for data centers. However, this design consistently suffers from poor routing efficiency compared with other designs because of its relatively long network diameter (the maximum shortest-path length between any pair of nodes), which is known to be $n\lfloor k/2 \rfloor$ for an n-D Torus with radix k. A long network diameter may also lead to high communication delay. Furthermore, its performance depends largely on the routing algorithm.
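To make the diameter concrete, the following minimal Python sketch computes the shortest-path hop count between two nodes of a k-ary n-cube and compares a brute-force diameter with the standard closed form n⌊k/2⌋. The function names `torus_distance` and `torus_diameter` are illustrative and do not come from the paper.

```python
from itertools import product


def torus_distance(a, b, k):
    """Hop count between nodes a and b in a k-ary n-cube.

    In each dimension a packet may travel either way around the ring,
    so the per-dimension cost is the shorter of the two directions.
    """
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


def torus_diameter(n, k):
    """Brute-force diameter of a k-ary n-cube.

    The torus is vertex-symmetric, so it is enough to measure the
    farthest distance from the origin.
    """
    origin = (0,) * n
    return max(
        torus_distance(origin, node, k)
        for node in product(range(k), repeat=n)
    )


if __name__ == "__main__":
    # Columns: n, k, brute-force diameter, closed form n * floor(k / 2).
    for n, k in [(2, 4), (3, 4), (3, 5), (3, 8)]:
        print(n, k, torus_diameter(n, k), n * (k // 2))
```

For a 3D Torus with radix 8 (512 servers), for example, both columns give 12 hops, which illustrates how quickly the worst-case path grows with the radix.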
To address these shortcomings, in this paper we propose a novel container-level, high-performance Torus-based DCN architecture named NovaCube. The key design principle of NovaCube is to connect the farthest node pairs by adding jump-over links. In this way, NovaCube can halve the network diameter and achieve higher bisection bandwidth and throughput. Moreover, we design a new weighted probabilistic oblivious deadlock-free routing algorithm, PORA, for NovaCube, which achieves a low average routing path length and good load balancing by exploiting path diversity.
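The effect of jump-over links on the diameter can be checked on a small instance. The sketch below is only an illustration under a simplifying assumption: every node of an even-radix Torus receives one extra link to the node offset by k/2 in every dimension (its farthest node). The paper's actual wiring rules may differ, and the names `build_torus` and `diameter` are hypothetical.

```python
from collections import deque
from itertools import product


def build_torus(n, k, jump_over=False):
    """Adjacency map of a k-ary n-cube.

    If jump_over is True, each node also gets one link to the node that
    is k/2 away in every dimension (assumed farthest node, k even).
    """
    adj = {}
    for node in product(range(k), repeat=n):
        nbrs = set()
        for d in range(n):
            for step in (1, -1):
                nbr = list(node)
                nbr[d] = (nbr[d] + step) % k
                nbrs.add(tuple(nbr))
        if jump_over:
            # Assumption for this sketch: one jump-over link per node.
            nbrs.add(tuple((c + k // 2) % k for c in node))
        nbrs.discard(node)
        adj[node] = nbrs
    return adj


def diameter(adj):
    """Longest shortest path in the graph, found by BFS from every node."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    for n, k in [(2, 4), (3, 4)]:
        plain = diameter(build_torus(n, k))
        jumped = diameter(build_torus(n, k, jump_over=True))
        print(n, k, plain, jumped)
```

Under this assumption, a 4-ary 2-cube drops from a diameter of 4 to 2, and a 4-ary 3-cube from 6 to 3, consistent with the halving described above.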
The primary contributions of this paper can be summarized as follows:
- (1) We propose a novel Torus-based DCN architecture, NovaCube, which exhibits good performance in network latency, bisection bandwidth, throughput, path diversity and average path length.
- (2) We carefully design a weighted probabilistic oblivious routing algorithm, PORA, which is both deadlock-free and livelock-free, and helps NovaCube achieve good load balancing.
- (3) We introduce a credit-based lossless flow control mechanism for the NovaCube network.
- (4) We design a practical geographical address assignment mechanism, which can also be applied to the traditional Torus network.
- (5) We implement the NovaCube architecture, the PORA routing algorithm and the flow control mechanism in NS-3. Extensive simulations are conducted to demonstrate the good performance of NovaCube.
The rest of the paper is organized as follows. Section 2 briefly reviews the related research literature. Section 3 presents the motivation. Section 4 introduces and analyzes the NovaCube architecture in detail. Section 5 presents the design of the PORA routing algorithm. Section 6 introduces the credit-based flow control mechanism. Section 7 describes a geographical address assignment mechanism. Section 8 presents the system evaluation and simulation results. Finally, Section 9 concludes the paper.
Section snippets
Network interconnection
The Torus-based topology exploits network locality by placing servers in close proximity to each other, which increases communication efficiency. Besides being widely used in supercomputing, the Torus network has also been introduced to data center networks. Three typical representatives are CamCube [9], Small-World [10] and CLOT [12].
CamCube was proposed by Abu-Libdeh et al., and the servers in CamCube are interconnected in a 3D Torus topology. The CamCube is designed
Why Torus-based clusters
The Torus-based (or, more precisely, k-ary n-cube based) interconnection has been regarded as an attractive DCN architecture for data centers because of its unique advantages, some of which are listed below.
Firstly, it incurs lower infrastructure cost since it is a switchless architecture that needs no expensive switches. In addition, the power consumed by switches and their associated cooling can also be saved.
Secondly, it achieves better fault-tolerance. Traditional architecture
NovaCube network design
This section presents the network design and theoretical analysis of NovaCube. Before introducing the physical interconnection structure, we first provide a theorem with proof, which offers a theoretical basis for the NovaCube design.
Theorem 4.1. For any node A(a1, a2, …, an) in a k-ary n-cube (when k is even), if B(b1, b2, …, bn) is assumed to be the farthest node from A, then B is unique and B’s unique farthest node is exactly A.
Proof. In a k-ary n-cube, if B(b1, b2, …, bn) is the farthest node from A(a1, a2
Routing scheme
This section presents PORA, a routing algorithm specially designed for NovaCube, which aims to help NovaCube achieve its maximum theoretical performance. PORA is a weighted probabilistic oblivious routing algorithm. It is also livelock-free and deadlock-free.
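The general idea of weighted probabilistic oblivious routing can be illustrated with a short Python sketch. This is not PORA itself: the snippet above does not give PORA's weights or its deadlock-avoidance rules, so the code assumes a plain Torus without jump-over links, restricts itself to minimal paths, and weights each dimension by its remaining hop count. All names (`signed_offsets`, `next_hop`, `route`) are hypothetical.

```python
import random


def signed_offsets(cur, dst, k):
    """Shortest signed offset per dimension from cur to dst on rings of size k."""
    offsets = []
    for c, d in zip(cur, dst):
        fwd = (d - c) % k
        # Go forward when it is no longer than going backward.
        offsets.append(fwd if fwd <= k - fwd else fwd - k)
    return offsets


def next_hop(cur, dst, k, rng=random):
    """Choose the next node on a minimal path.

    The choice is oblivious: it depends only on cur and dst, never on
    network load. A dimension is picked with probability proportional
    to the hops still needed in it (an assumed weighting).
    """
    offsets = signed_offsets(cur, dst, k)
    dims = [i for i, off in enumerate(offsets) if off != 0]
    if not dims:
        return cur
    weights = [abs(offsets[i]) for i in dims]
    dim = rng.choices(dims, weights=weights, k=1)[0]
    step = 1 if offsets[dim] > 0 else -1
    nxt = list(cur)
    nxt[dim] = (nxt[dim] + step) % k
    return tuple(nxt)


def route(src, dst, k, rng=random):
    """Full path from src to dst; every hop reduces the remaining distance by one."""
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(path[-1], dst, k, rng))
    return path


if __name__ == "__main__":
    print(route((0, 0, 0), (2, 3, 1), 4, random.Random(1)))
```

Because every hop moves strictly closer to the destination, a packet in this sketch always arrives after exactly the minimal number of hops, while different packets of the same source-destination pair may spread over different minimal paths. Deadlock freedom is not addressed here.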
Credit-based flow control mechanism
Flow control, also known as congestion control, manages the rate of data transmission between devices or nodes in a network to prevent network buffers from being overwhelmed. When more data arrives than a device can handle, the data overflows, meaning it is either lost or must be retransmitted. Thus, the main objective of flow control is to limit packet delay and avoid buffer overflow. In the traditional Internet, protocols with flow control functionality like TCP
Network layering
Similar to the traditional Internet, the protocol stack of NovaCube is divided into five abstraction layers: the application layer, transport layer, network layer, link layer and physical layer. The only difference lies in the network layer, where the traditional Internet uses IP addresses to locate hosts while NovaCube uses coordinates to direct data transmission, exploiting the symmetry of the topology to greatly improve routing efficiency. Except for network
Evaluation
In this section, we evaluate the performance of NovaCube and the PORA routing algorithm under various network conditions using network simulator 3 (NS-3) [46]. The link bandwidth is 1 Gbps and each link supports bidirectional communication. The default maximum transmission unit (MTU) of a link is 1500 bytes. The propagation delay of a link and the processing time for a packet at a node are set to 4 μs and 1.5 μs, respectively. Besides, Weibull Distribution is adopted to determine the
Conclusion
In this paper, we proposed a novel data center architecture named NovaCube, and presented its design and key properties. As a switchless architecture, NovaCube’s cost-effectiveness is highlighted with regard to its energy consumption and infrastructure cost. As proved, NovaCube is also superior to other candidate architectures in terms of network diameter, throughput, average path length, bisection bandwidth, path diversity and fault tolerance. Furthermore, the specially designed probabilistic
Acknowledgment
This research has been supported by a Grant from Huawei Technologies Co., Ltd. The authors would also like to express their thanks and gratitude to the anonymous reviewers whose constructive comments helped improve the manuscript.
References (46)
- et al. Designing efficient high performance server-centric data center network architecture. Comput. Netw. (2015)
- et al. Jet: electricity cost-aware dynamic workload management in geographically distributed datacenters. Comput. Commun. (2014)
- et al. Towards bandwidth guaranteed energy efficient data center networking. J. Cloud Comput. (2015)
- et al. A scalable, commodity data center network architecture. Proceedings of the ACM SIGCOMM Computer Communication Review (2008)
- et al. VL2: a scalable and flexible data center network. SIGCOMM Comput. Commun. Rev. (2009)
- et al. DCell: a scalable and fault-tolerant network structure for data centers. ACM SIGCOMM Comput. Commun. Rev. (2008)
- et al. BCube: a high performance, server-centric network architecture for modular data centers. ACM SIGCOMM Comput. Commun. Rev. (2009)
- et al. c-Through: part-time optics in data centers. Proceedings of the ACM SIGCOMM Computer Communication Review (2010)
- et al. Helios: a hybrid electrical/optical switch architecture for modular data centers. Proceedings of the ACM SIGCOMM Computer Communication Review (2010)
- et al. SprintNet: a high performance server-centric network architecture for data centers. Proceedings of the 2014 IEEE International Conference on Communications (ICC) (2014)
- Symbiotic routing in future data centers. ACM SIGCOMM Comput. Commun. Rev.
- Small-world datacenters. Proceedings of the 2nd ACM Symposium on Cloud Computing
- NovaCube: a low latency torus-based network architecture for data centers. Proceedings of the 2014 IEEE Global Communications Conference (GLOBECOM)
- CLOT: a cost-effective low-latency overlaid torus-based network architecture for data centers. Proceedings of the 2015 IEEE International Conference on Communications (ICC)
- Rethinking the data center networking: architecture, network protocols, and resource sharing. IEEE Access
- Cutting the electricity cost of distributed datacenters through smart workload dispatching. IEEE Commun. Lett.
- Minimizing electricity cost: optimization of distributed internet data centers in a multi-electricity-market environment. Proceedings of the INFOCOM, 2010
- A general framework for performance guaranteed green data center networking. Proceedings of the 2014 IEEE Global Communications Conference (GLOBECOM)
- ElasticTree: saving energy in data center networks. Proceedings of the NSDI
- VMFlow: leveraging VM mobility to reduce network power costs in data centers. Proceedings of the NETWORKING 2011
- PowerNap: eliminating server idle power. Proceedings of the ACM Sigplan Notices
- Power management of datacenter workloads using per-core power gating. Comput. Archit. Lett.
- Memory power management via dynamic voltage/frequency scaling. Proceedings of the 8th ACM international conference on Autonomic computing