Toward efficient parallel routing optimization for large-scale SDN networks using GPGPU

https://doi.org/10.1016/j.jnca.2018.03.031

Abstract

Routing optimization is an efficient way to improve network performance and guarantee the QoS requirements of users. However, with the rapid growth of network size and traffic demands, routing optimization in SDN networks with a centralized control plane faces a scalability issue. To overcome it, we aim to speed up the routing optimization process in large networks by exploiting the massively parallel computation capability of GPUs. In this paper, we develop an efficient Lagrangian Relaxation based Parallel Routing Optimization Algorithm (LR-PROA). LR-PROA first decomposes the routing optimization problem into a set of path calculation problems, one per traffic demand, by relaxing the link capacity constraints; the path calculation tasks are then dispatched to the GPU and executed concurrently. To achieve a high degree of parallelism, LR-PROA also parallelizes the path calculation process for each traffic demand. Furthermore, to improve the convergence speed, in each iteration LR-PROA uses efficient methods to adjust the calculated paths for a subset of the traffic demands and to set the step size of the subgradient algorithm that solves the Lagrangian dual problem. Our evaluations on synthetic network topologies verify that LR-PROA achieves good optimization performance as well as superior calculation time efficiency. In our simulations, LR-PROA is up to tens of times faster than benchmark algorithms in large networks.

Introduction

Software-Defined Networking (SDN) (McKeown et al., 2008) is an emerging networking paradigm that has a centralized and programmable control plane. With this control plane, SDN networks can flexibly control the routing of fine-grained traffic demands in a timely manner. The deployment of SDN enables network operators to carry out efficient, online routing optimization according to the current status of the network (Nunes et al., 2014; Masoudi and Ghaffari, 2016). Experimental results from the SDN-enabled data center networks of Microsoft (Hong et al., 2013) and Google (Jain et al., 2013) verify that SDN networks can achieve near-optimal throughput and link utilization by implementing routing optimization.

However, routing optimization in SDN networks also faces scalability challenges raised by the centralized implementation of the control plane. First, the number of traffic demands increases dramatically with the rapid development of network applications (Cisco, 2010). In an SDN network, a large number of traffic demands may arrive in a short time, so the centralized control plane must complete the path calculation for these demands very quickly. Second, to accommodate the increased traffic, network sizes have also grown significantly over the last decade (e.g., a data center network may have hundreds of thousands of switches) (Guo et al., 2015; Soliman and Song, 2017). Finally, SDN techniques enable networks to run at almost full capacity (Gay et al., 2017), which implies that network controllers need to complete routing re-optimization as soon as the traffic changes significantly or unexpected network failures happen. Therefore, fast and efficient routing optimization for a large number of traffic demands is an important but challenging problem for large-scale SDN networks.

To handle the scalability challenges of SDN, we explore efficient parallel routing optimization algorithms that are suitable for running on GPUs, which have massively parallel computation capability. The routing optimization problem considered in this paper tries to reduce the total routing cost of the traffic demands under the link capacity constraint. We assume that a batch of traffic demands arriving at a network in a short time is given, and the network controllers need to calculate paths for the traffic demands such that the total routing cost of the traffic demands is minimized under the link capacity constraint. In order to accommodate more traffic demands, we assume that a blocked traffic demand has a high routing cost.

The routing optimization problem considered in this paper is a combinatorial optimization problem, and it is proved to be NP-hard in Section 5. Generally, solving the routing optimization problem for large networks is computationally hard. To speed up the routing optimization process, a natural choice is to develop parallel routing optimization algorithms, which calculate paths for traffic demands simultaneously on many different threads in each iteration.

Meanwhile, today's commodity servers with multi-core CPUs and GPUs provide a low-cost, high-performance computation environment for parallel algorithms. Furthermore, to facilitate programming on GPUs, Nvidia released a programming framework called Compute Unified Device Architecture (CUDA) (Nvidia, 2010). CUDA is highly suitable for General Purpose programming on GPUs (GPGPU). With the drastic improvement in the performance and general programmability of GPUs, GPU-based parallel computation is a promising approach for solving combinatorial optimization problems (Brodtkorb et al., 2013b). Existing studies reveal that well-designed parallel algorithms running on GPUs can significantly speed up the computation process (Brodtkorb et al., 2013b).

However, due to architectural constraints, using GPUs for general-purpose computation also has some limitations. First, GPU cores lack control logic for branch prediction and other aggressive speculative execution techniques (Suchard et al., 2010), which implies that GPUs are not well suited to computations with complex control logic. Second, in each iteration of an algorithm, data must be copied from host memory to device memory, which incurs extra delay. Therefore, to fully exploit the massively parallel computation capability of GPUs, a parallel algorithm needs a fine-grained decomposition. Moreover, to reduce the memory copy delay, the number of iterations of the parallel algorithm should be minimized.

Based on the above motivations, we study how to utilize the massively parallel computation capability of GPUs to solve the routing optimization problem in large-scale SDN networks. To the best of our knowledge, this problem has not been well studied (McCormick et al., 2014; Kikuta et al., 2015). In this work, we develop an efficient parallel routing optimization algorithm, which is massively parallelized and suitable for GPU computation. We implement the algorithm on Nvidia GPUs using the CUDA model. The evaluation results show that the proposed parallel routing optimization algorithm is much faster than sequential routing optimization algorithms. The main contributions of this paper are summarized as follows.

  • We propose a Lagrangian Relaxation based Parallel Routing Optimization Algorithm (LR-PROA). LR-PROA decouples the path calculation task for each traffic demand by relaxing the link capacity constraint, and the path calculation tasks are executed concurrently on the GPU. In LR-PROA, the path calculation algorithm for each traffic demand is also parallelized to get the best speedup ratio.

  • To improve the convergence speed, we propose efficient methods to adjust the calculated paths for a part of the traffic demands and to set the step size of the subgradient algorithm for solving the Lagrangian dual problem in each iteration of LR-PROA. With the proposed methods, LR-PROA converges quickly.

  • We compare LR-PROA against benchmark solutions through simulations on synthetic networks. The results verify the superior efficiency of the proposed algorithm, which can be up to tens of times faster than the benchmarks for large networks.

The remainder of the paper is organized as follows. Related work on routing optimization and GPU-based parallel algorithms is reviewed in Section 2. Section 3 describes the GPU architecture and programming model. The network model and problem formulation are presented in Section 4. In Section 5, the routing optimization problem considered in this paper is proved to be NP-hard. Section 6 presents the parallel routing optimization algorithm LR-PROA. Performance evaluations are shown in Section 7, and Section 8 concludes this paper.

Related work

As defined in (Lee and Mukherjee, 2004), routing optimization is to put the traffic where network bandwidth is available. Essentially, routing optimization is an effective scheme for improving network service capability without causing network congestion. Due to the rapid growth of traffic on the Internet, the routing optimization problem has attracted extensive attention during the last decade. We refer the reader to (Wang et al., 2008; Lee and Mukherjee, 2004; Mendiola et al., 2017) for

GPU hardware architecture and CUDA programming model

In this paper, we implement the proposed parallel routing optimization algorithm on Nvidia GPUs using the CUDA programming model. For ease of understanding, we first present the GPU hardware architecture and CUDA programming model briefly.

A GPU has a large number of independent processors (e.g., the Nvidia GT200 has 240 processors), which can execute the same instruction on different data concurrently. Thus, a GPU is well suited for massively parallel computing. CUDA (Nvidia, 2010) is a programming

Network model

We model an SDN network as a connected undirected graph G(V, E), where V is the set of nodes and E is the set of links. Let n = |V| and m = |E| denote the number of nodes and links, respectively. Each link (i, j) ∈ E is associated with a weight wij, which denotes the cost of carrying one unit of traffic on link (i, j). Without loss of generality, we assume that the link weight wij for each link (i, j) ∈ E is an integer. The capacity of link (i, j) is cij. Let D be the set of traffic demands that need
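As a rough illustration of this model, the following Python sketch stores per-link weights and capacities for a small undirected graph and evaluates one candidate routing. The `Demand` fields (`src`, `dst`, `bw`), the helper names, and the toy numbers are all assumptions made for the example; the excerpt above does not show how the paper defines a demand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    src: int
    dst: int
    bw: int  # requested bandwidth (assumed attribute, not from the paper)


def link(i, j):
    """Canonical key for the undirected link (i, j)."""
    return (i, j) if i < j else (j, i)


# Toy 4-node network: link -> (weight w_ij, capacity c_ij). Values are made up.
LINKS = {
    link(0, 1): (1, 10),
    link(1, 3): (1, 10),
    link(0, 2): (2, 20),
    link(2, 3): (2, 20),
}


def path_links(path):
    """Links traversed by a path given as a list of nodes."""
    return [link(a, b) for a, b in zip(path, path[1:])]


def routing_cost(demand, path):
    """w_ij is a per-unit cost, so the path cost is bw * sum of w_ij."""
    return demand.bw * sum(LINKS[e][0] for e in path_links(path))


def link_loads(routing):
    """Total bandwidth placed on every link by a {demand: path} routing."""
    loads = {e: 0 for e in LINKS}
    for demand, path in routing.items():
        for e in path_links(path):
            loads[e] += demand.bw
    return loads


def is_feasible(routing):
    """True if no link carries more than its capacity c_ij."""
    loads = link_loads(routing)
    return all(loads[e] <= LINKS[e][1] for e in LINKS)


d1, d2 = Demand(0, 3, 8), Demand(0, 3, 6)
routing = {d1: [0, 1, 3], d2: [0, 2, 3]}
total = sum(routing_cost(d, p) for d, p in routing.items())
print(total, is_feasible(routing))
```

Placing both demands on the cheaper path 0-1-3 would load its links with 14 units against a capacity of 10, which is the kind of coupling between demands that the capacity constraint introduces.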

The hardness of the routing optimization problem

Theorem 1

The routing optimization problem considered in this paper is NP-hard.

Proof

We will show the Routing Optimization Problem (ROP) studied in this paper is NP-hard, by outlining a reduction from the Unsplittable Flow Problem (UFP) (Chakrabarti et al., 2007) to ROP. An instance of UFP is given by a graph G(V, E) and a set of flows F. Each link (i, j) ∈ E has a capacity cij. The size and the weight of a flow f are df and wf, respectively. The UFP is to find the maximum weight subset of flows from F and a

The Lagrangian Relaxation based parallel routing optimization algorithm

Essentially, the routing optimization problem is to select a path for each traffic demand, so that the objective is optimal. However, in the MILP formulation, the link capacity constraints (Eq. (4)) tie together the routing variables of all the traffic demands by restricting the used bandwidth on each link (i, j) to at most βcij. The link capacity constraints make it impossible to route each traffic demand independently. Therefore, to achieve parallelized routing calculation, we decouple the
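The decoupling idea described in this section can be sketched with a generic, textbook Lagrangian relaxation loop. The code below is sequential Python, not the authors' GPU implementation: each link gets a multiplier, every demand independently finds a shortest path under the weights wij plus the multiplier, and a projected subgradient step updates the multipliers from the capacity violations. The function names, the diminishing step size `step / (k + 1)`, the default `beta`, and the toy data are assumptions; the paper's own step-size rule, path adjustment method, and blocked-demand cost are not reproduced here.

```python
import heapq


def shortest_path(adj, cost, src, dst):
    """Dijkstra on a connected undirected graph; cost maps frozenset({i, j}) to a number."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + cost[frozenset((u, v))]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def lr_route(weights, caps, demands, beta=1.0, iters=50, step=0.5):
    """Return (total_cost, paths) of the best feasible routing seen, or None."""
    adj = {}
    for e in weights:
        i, j = tuple(e)
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    lam = {e: 0.0 for e in weights}  # one multiplier per relaxed capacity constraint
    best = None
    for k in range(iters):
        cost = {e: weights[e] + lam[e] for e in weights}
        # With capacities relaxed, each demand is an independent subproblem.
        paths = [shortest_path(adj, cost, s, t) for s, t, _ in demands]
        load = {e: 0.0 for e in weights}
        for (_, _, bw), p in zip(demands, paths):
            for a, b in zip(p, p[1:]):
                load[frozenset((a, b))] += bw
        if all(load[e] <= beta * caps[e] for e in weights):
            total = sum(
                bw * sum(weights[frozenset((a, b))] for a, b in zip(p, p[1:]))
                for (_, _, bw), p in zip(demands, paths)
            )
            if best is None or total < best[0]:
                best = (total, paths)
        # Projected subgradient step on the dual variables.
        t_k = step / (k + 1)
        for e in weights:
            lam[e] = max(0.0, lam[e] + t_k * (load[e] - beta * caps[e]))
    return best


W = {frozenset((0, 1)): 1, frozenset((1, 3)): 1,
     frozenset((0, 2)): 2, frozenset((2, 3)): 2}
C = {frozenset((0, 1)): 10, frozenset((1, 3)): 10,
     frozenset((0, 2)): 20, frozenset((2, 3)): 20}
result = lr_route(W, C, [(0, 3, 8), (0, 3, 6)])
print(result)
```

In this toy instance both demands share the same endpoints, so they always receive the same path under per-link multipliers. The sketch therefore returns the feasible routing that sends both over 0-2-3 at a cost of 56, although splitting them across the two paths would cost 40. Gaps of this kind suggest why a path adjustment step, such as the one mentioned in the abstract, can help.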

Simulation setup

The parallel routing optimization problem studied in this paper originated from a large cellular network provider in a province of China. Due to privacy and security concerns, however, the provider did not give us the real topologies. To evaluate the performance of LR-PROA, we therefore conduct extensive simulations on synthetic topologies generated with the Barabási-Albert (BA) model (Albert and Barabasi, 2002). The BA model generates a random topology by beginning with an initially connected
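For readers unfamiliar with the BA model, the following Python sketch shows the standard preferential-attachment construction: start from a small connected seed graph, then attach each new node to `m` existing nodes chosen with probability proportional to their degree. The choice of a complete seed graph on `m + 1` nodes, the function name, and the parameter values are assumptions; the excerpt does not state the parameters used in the paper's simulations.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Return the undirected edge set of a BA-style graph with n nodes.

    Assumption: the seed graph is a complete graph on the first m + 1 nodes.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    endpoints = []  # each node appears once per incident edge (degree-weighted)
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.add((i, j))
            endpoints += [i, j]
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Uniform choice from the endpoint list is degree-proportional sampling.
            targets.add(rng.choice(endpoints))
        for u in sorted(targets):
            edges.add((u, v))
            endpoints += [u, v]
    return edges


topology = barabasi_albert(100, 2, seed=1)
print(len(topology))
```

Because every new node links to nodes that already exist, the generated topology stays connected, which matches the connected-graph assumption of the network model.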

Conclusion

In this paper, we proposed LR-PROA, which is a parallel algorithm running on GPU for the routing optimization problem in large-scale SDN networks. LR-PROA uses Lagrangian relaxation to decouple the routing problem for each traffic demand, and improves the routing solution iteratively. In each iteration, the path calculation for each demand is executed on GPU in parallel, and to utilize the massive parallel computation power of GPU, the path calculation algorithm for each traffic demand is also

Acknowledgment

We would like to thank the reviewers for their valuable comments. This work is partially supported by the NSFC Fund (61671130, 61701058, 61271165, 61301153), the National Basic Research Program (China’s 973 Program) (2013CB329103), the 111 Project (B14039), the Technology Program of Sichuan Province (2016GZ0138), and the Fundamental Research Funds for the Central Universities (ZYGX2016J002).

References (38)

  • A. Elwalid et al.

    MATE: MPLS adaptive traffic engineering

  • B. Fortz et al.

    Internet traffic engineering by optimizing OSPF weights

  • S. Gay et al.

    Expect the unexpected: sub-second optimization for segment routing

  • C. Guo et al.

    Pingmesh: a large-scale system for data center network latency measurement and analysis

  • P. Harish et al.

    Accelerating large graph algorithms on the GPU using CUDA

  • J. He et al.

    DATE: distributed adaptive traffic engineering

  • C.Y. Hong et al.

    Achieving high utilization with software-driven WAN

  • IBM

    IBM ILOG CPLEX Optimization Studio

    (2014)

  • S. Jain et al.

    B4: experience with a globally-deployed software defined WAN

Xiong Wang is an associate professor in the school of information and communication at the University of Electronic Science and Technology of China, Chengdu, China. He received his Ph.D. in communication and information systems from the University of Electronic Science and Technology of China, Chengdu, China, in 2008. From 2013 to 2014, he was a visiting scholar in electrical and computer engineering at the University of California, Davis, CA, USA. His research interests include network measurement, modeling and optimization, algorithm analysis and design, and network management in communication networks.

Qian Zhang received his B.S. degree in communication engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 2015. He is currently pursuing the Master's degree in communication and information systems with the School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China. His research interests include traffic engineering, software defined networks, and network virtualization.

Jing Ren received her Ph.D. degree in communication and information systems from the University of Electronic Science and Technology of China, Chengdu, China, in 2007. She is now a lecturer in the school of information and communication at the University of Electronic Science and Technology of China, Chengdu, China. Her research interests include network architecture and protocol design, information-centric networking, and software-defined networking.

Sheng Wang received the B.S. degree in electronic engineering, and the M.S. and Ph.D. degrees in communication engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 1992, 1995, and 2000, respectively. He is now a professor in the school of information and communication at the University of Electronic Science and Technology of China, Chengdu, China. His research interests include planning and optimization of wired and wireless networks, the next-generation Internet, and next-generation optical networks. He is a senior member of the communication society of China, a member of the ACM, and a member of the China Computer Federation.

Shizhong Xu received his B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 1994, 1997, and 2000, respectively. He is now a professor in the school of information and communication at the University of Electronic Science and Technology of China, Chengdu, China. His research interests include the Internet of Things, next-generation networks, and network science.

Shui Yu received his B.Eng. (Electronic Engineering) and M.Eng. (Computer Science) degrees from the University of Electronic Science and Technology of China, P. R. China, in 1993 and 1999, respectively. He also obtained an Associate Degree in Mathematics from the same university in 1993. He received his Ph.D. (Computer Science) from Deakin University in 2004. He is currently a Senior Lecturer in the School of Information Technology, Deakin University, Melbourne, Australia. Before joining Deakin University, Dr Yu was a Lecturer in the Computer College at the University of Electronic Science and Technology of China. He has substantial industry experience, especially in network design and in the organization and implementation of software development. His research interests include Big Data Theory and Application, Networking Theory and Application, and Mathematical Modelling. He dedicates himself to advancing human understanding of networks and information, including their measurement, representation, analysis, and application. As a semi-mathematician, he aims to narrow the gap between theory and application using mathematical tools. Dr Yu is a Guest Professor of South West University of China and an overseas expert of the national 111 project at Beijing Jiaotong University. Dr Yu is a Member of AAAS and ACM, and a Senior Member of IEEE.