QoS provisioning for various types of deadline-constrained bulk data transfers between data centers
Introduction
Similar to Internet Service Providers (ISPs), Cloud Service Providers (CSPs) aim to satisfy as many customers as possible with high throughput. Most of the leading CSPs deploy geographically distributed data centers (DCs) to provide various types of services to their customers. These DCs are typically connected by high-bandwidth links across wide-area networks (WANs).
The network infrastructures that connect these geographically distributed DCs cost millions of dollars annually [2], but unfortunately have not been fully utilized. The average utilization of network resources is only 40%–60% [2] even on busy inter-DC links, and 30%–40% [3] on many others, partially due to the traditional best-effort transfer method on the Internet. As the number of cloud-based applications continues to increase, it has become a significant challenge to fully utilize the bandwidth resources of inter-DC network links to accommodate as many data transfer requests as possible while maximizing the throughput of the entire network system.
Nowadays, the backbones of many WANs employ new technologies to create high-performance networks (HPNs) (e.g., ESnet [4] and Internet2 [5]), which provide the capability of advance bandwidth reservation over dedicated channels provisioned by circuit-switching infrastructures or IP-based tunneling techniques for big data transfer. In particular, the emerging software-defined networking (SDN) technologies greatly facilitate HPN deployments, and in fact, many HPNs have incorporated SDN capabilities into their network infrastructures to provide better Quality of Service (QoS). Such networks feature a virtual single-switch abstraction on top of data planes that employ both a bandwidth reservation system and SDN concepts [6], [7]. Generally, bandwidth reservation allows a batch of user transfer requests accumulated over a period of time to be scheduled collectively in advance, and has proven to be an effective solution for providing QoS-guaranteed transfer services while achieving a high utilization of network resources.
There exist different types of inter-DC data transfers, among which, bulk data transfer requests (BDTRs) for large data volumes on the order of terabytes to petabytes with deadline constraints account for a major portion of traffic (e.g., 85%–95% in some WANs) [2], [8], [9], [10].
However, most existing solutions for BDTRs are tailored for private cloud services, which limits their generalization and scope of application. For example, both Software-Driven WAN (SWAN) [2] and B4 [3] take traffic engineering approaches to improve inter-DC WAN utilization by considering traffic characteristics and priorities (e.g., interactive, elastic, and background). Neither of them, however, addresses the deadline constraint of BDTRs, one of the most common performance requirements from users [9]. As an increasing number of applications in scientific and many other domains have migrated from local computing and storage platforms to clouds, the demand for inter-DC data transfer with different types of BDTRs is rapidly growing, but the bandwidth scheduling problem in the emerging cloud environment remains largely unexplored.
In this paper, we investigate a bandwidth scheduling problem for two types of BDTRs with fixed or variable bandwidth. Given multiple such BDTRs, we aim to fully utilize inter-DC link bandwidth resources and schedule as many BDTRs as possible while minimizing the earliest completion time (ECT) of each request. Specifically, we construct a rigorous cost model, define a new performance metric named user satisfaction degree, and then formulate a generic problem, Bandwidth Scheduling for Multiple Requests of Various Types, referred to as BS-MRVT. We prove BS-MRVT to be not only NP-complete but also non-approximable, and then propose an efficient heuristic scheduling algorithm. We conduct proof-of-concept experiments on a Mininet-based emulated testbed and also extensive simulations for bandwidth scheduling in both simulated and real-life networks. Both experimental and simulation results show that our proposed algorithm significantly outperforms existing methods in terms of user satisfaction degree and scheduling success ratio.
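To make the distinction between fixed-bandwidth and variable-bandwidth requests concrete, the following is a minimal sketch and not the paper's cost model or algorithm. It assumes a single link whose residual bandwidth is piecewise constant over sorted, non-overlapping time slots, reads "fixed bandwidth" as one constant rate held over a contiguous interval, and reads "variable bandwidth" as using whatever bandwidth is available in each slot. The names `Slot`, `ect_variable`, and `ect_fixed` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical model: (start_time, end_time, residual_bandwidth) per slot,
# sorted by time and non-overlapping. Gaps between slots carry no bandwidth.
Slot = Tuple[float, float, float]


def ect_variable(slots: List[Slot], size: float,
                 release: float, deadline: float) -> Optional[float]:
    """Earliest completion time when the request may use all residual
    bandwidth of every slot between its release time and its deadline."""
    remaining = size
    for start, end, bw in slots:
        lo, hi = max(start, release), min(end, deadline)
        if hi <= lo or bw <= 0:
            continue  # slot lies outside the window or has no capacity
        capacity = bw * (hi - lo)
        if capacity >= remaining:
            return lo + remaining / bw  # transfer finishes inside this slot
        remaining -= capacity
    return None  # cannot finish before the deadline


def ect_fixed(slots: List[Slot], size: float, rate: float,
              release: float, deadline: float) -> Optional[float]:
    """Earliest completion time when the request needs a constant `rate`
    for an uninterrupted interval of length size / rate."""
    duration = size / rate
    run_start = None  # start of the current run of slots with bw >= rate
    prev_hi = None    # clipped end of the previously examined slot
    for start, end, bw in slots:
        lo, hi = max(start, release), min(end, deadline)
        if hi <= lo:
            continue
        if bw < rate:
            run_start = None  # the run is broken by an insufficient slot
        else:
            contiguous = prev_hi is not None and lo == prev_hi
            if run_start is None or not contiguous:
                run_start = lo  # begin a new run at this slot
            if hi - run_start >= duration:
                return run_start + duration
        prev_hi = hi
    return None  # no contiguous interval of sufficient length


# Illustrative data only: residual bandwidth on one link over three slots.
slots = [(0.0, 2.0, 4.0), (2.0, 5.0, 10.0), (5.0, 8.0, 6.0)]
variable_ect = ect_variable(slots, size=30.0, release=0.0, deadline=8.0)
fixed_ect = ect_fixed(slots, size=30.0, rate=6.0, release=0.0, deadline=8.0)
```

Under these assumptions the variable-bandwidth request finishes earlier than the fixed-bandwidth one on the same link, because it can also exploit slots whose residual bandwidth is below the fixed rate. A full scheduler would additionally have to choose paths and an order among competing requests, which this sketch does not attempt.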
The rest of this paper is organized as follows. We conduct a survey of related work in Section 2. We construct network models and formulate BS-MRVT with complexity analysis in Section 3. We design the algorithm with a detailed illustration in Section 4. We conduct performance evaluation in Section 5 and conclude our work in Section 6.
Related work
There have been a number of successful research efforts in making full use of network resources for bulk data transfer between data centers using centralized traffic engineering techniques. For example, Jain et al. developed B4 [3], which is able to globally schedule massive bandwidth requirements at a modest number of sites. Hong et al. designed SWAN [2], a centrally controlled system that enables inter-DC WANs to carry more traffic. Kandula et al. developed TEMPUS [8], an online temporal
Problem formulation
In this section, we construct rigorous network models, define a new performance metric named user satisfaction degree, and then formally formulate the problem along with a detailed complexity analysis.
Algorithms design and analysis
The NP-completeness of BS-MRVT indicates that there does not exist any polynomial-time optimal algorithm for BS-MRVT unless P = NP. In this section, we focus on the design of a heuristic algorithm.
Performance evaluation
For performance evaluation, we implement the proposed algorithms and conduct (i) proof-of-concept experiments on an emulated SDN testbed based on the Mininet [27] system, and (ii) extensive simulations in randomly generated networks as well as a real-life HPN topology.
Conclusion and future work
In this paper, we investigated an advance bandwidth scheduling problem, i.e., Bandwidth Scheduling for Multiple Reservations of Various Types, referred to as BS-MRVT, with the objective of maximizing the scheduling success ratio of BDTRs while minimizing the data transfer completion time of each request. We considered two different types of BDTRs: FBBRs and VBBRs. We proved BS-MRVT to be both NP-complete and non-approximable, and proposed an efficient heuristic algorithm, FMS-MRVT. For performance
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research is sponsored by National Natural Science Foundation of China under Grant No. U1609202, Key Research and Development Plan of Shaanxi Province, China under Grant No. 2018GY-011, and Xi’an Science and Technology Plan Project under Grant No. GXYD18.2 with Northwest University, China. The authors would also like to acknowledge the anonymous reviewers’ constructive comments.
References (31)
- et al., Multiple bulk data transfers scheduling among datacenters
- A. Hou, C. Wu, D. Fang, L. Zuo, M. Zhu, X. Zhang, R. Qiao, X. Yin, Bandwidth scheduling for big data transfer with...
- et al., Achieving high utilization using software-driven WAN
- et al., B4: Experience with a globally-deployed software defined WAN
- ESnet (2018)
- Internet2 (2018)
- et al., Software-defined networking for big-data science - architectural models from campus to the WAN
- et al., Scheduling and flexible control of bandwidth and in-transit services for end-to-end application workflows, Future Gener. Comput. Syst. (2015)
- et al., Calendaring for wide area networks
- et al., Guaranteeing deadlines for inter-data center transfers, IEEE/ACM Trans. Netw. (2017)
- Optimizing bulk transfers with software-defined optical WAN
- Lowering inter-datacenter bandwidth costs via bulk data scheduling
- An integrated transport solution to big data movement in high-performance networks
- Bandwidth on-demand for multimedia big data transfer across geo-distributed cloud data centers, IEEE Trans. Cloud Comput.
- Bandwidth scheduling for energy-efficiency in high-performance networks, IEEE Trans. Commun.
Aiqin Hou received the Ph.D. degree in computer science from Northwest University, Xi’an, China, in 2018. She is currently a faculty member with the School of Information Science and Technology, Northwest University, Xi’an, China. Her research interests include big data, high-performance network, and bandwidth scheduling.
Chase Q. Wu completed his Ph.D. dissertation with Oak Ridge National Laboratory, Oak Ridge, TN, USA, and received the Ph.D. degree in computer science from Louisiana State University, Baton Rouge, LA, USA, in 2003. He was a Research Fellow with Oak Ridge National Laboratory during 2003–2006 and an Associate Professor with the University of Memphis, Memphis, TN, USA, during 2006–2015. He is currently a Professor of computer science and the Director of the Center for Big Data, New Jersey Institute of Technology, Newark, NJ, USA. His research interests include big data, parallel and distributed computing, high-performance networking, sensor networks, and cybersecurity.
Ruimin Qiao received the B.S. degree in mathematics from Northwest University, Xi’an, China, in 2017. She is currently an M.S. student in the School of Information Science and Technology at Northwest University, Xi’an, China. Her research interests include big data computing and high-performance networking.
Liudong Zuo received the B.E. degree in computer science from University of Electronic Science and Technology of China in 2009 and the Ph.D. degree in computer science from Southern Illinois University Carbondale in 2015. He is currently an assistant professor in the Computer Science Department at California State University, Dominguez Hills. His research interests include computer networks, algorithm design, and big data.
Michelle M. Zhu received her Ph.D. degree in computer science from Louisiana State University in 2005. She finished her dissertation research in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. She was an associate professor in the Computer Science Department at Southern Illinois University, Carbondale, until 2016. She is currently an associate professor in the Department of Computer Science at Montclair State University. Her research interests include high-performance computing, grid and cloud computing, and big data.
Dingyi Fang is currently a Professor with the School of Information Science and Technology, Northwest University, Xi’an, China. His current research interests include mobile computing and distributed computing systems, network and information security, localization, social networks, and wireless sensor networks.
Weike Nie received the B.S. degree in electronic engineering, the M.S. degree in electronic and information engineering, and the Ph.D. degree in information and telecommunication engineering from XiDian University, Xi’an, China, in 1997, 2004, and 2009, respectively. Since September 2009, he has been with the School of Information Science and Technology, Northwest University, Xi’an, China, where he is currently an Associate Professor. He was a visiting scholar with New Jersey Institute of Technology from February 2017 to February 2018. His current research interests include array signal processing, blind signal processing, and wireless sensor network localization.
Feng Chen received the M.S. degree in computer science from Northwest University, Xi’an, China, in 2007, and the Ph.D. degree in computer science from Northwestern Polytechnical University, Xi’an, in 2012. He is currently a faculty member with Northwest University, Xi’an. His research interests are in the area of wireless networks, social networks, and Internet of Things.