Journal of Systems Architecture

Volume 98, September 2019, Pages 361-373

vSimilar: A high-adaptive VM scheduler based on the CPU pool mechanism

https://doi.org/10.1016/j.sysarc.2019.02.002

Abstract

In a virtualized system, the virtual machine (VM) scheduler plays a key role in improving the performance of the virtual machine monitor (VMM), a.k.a. hypervisor. The scheduler is responsible for assigning adequate system resources to each VM according to the demands of the VM tenants, which is quite challenging because those demands are dynamic and unpredictable. To this end, the CPU pool mechanism has been widely adopted as an adaptive solution. However, the CPU pool mechanism still has deficiencies in its VM classification model and its time-slice allocation strategy, both of which must be utilized effectively to realize a highly adaptive VM scheduler. In this paper, we thus explore opportunities to improve the CPU pool mechanism and develop a new VM scheduling solution, called vSimilar, which uses a VM multi-classification model to adapt more effectively to VMs that run different types of tasks at different times. Moreover, through a dynamic time-slice function, vSimilar provides more efficient resource allocation. The experimental evaluation shows that vSimilar can significantly improve the performance of a VMM such as Xen. The improvements include: 1) a VM server hosted by Xen with vSimilar can reduce a client's Ping round-trip time (Ping RTT) by nearly 95%; 2) vSimilar can increase TCP throughput by about 40% and UDP throughput by about 20% between a Xen-hosted VM server and a client; and 3) vSimilar also increases the page operation rate by nearly 50% for a Xen-hosted VM Web server.

Introduction

In virtualized cloud datacenters [1], VM scheduling plays a key role in improving the performance of a cloud system in various domains, such as real-time application support [2], security [3], network latency reduction [4], energy efficiency [5], and load balancing [6], to name just a few. The VM scheduler, a vital component of the hypervisor [7], is responsible for allocating CPU time to VMs. The design of a VM scheduler thus has a significant impact on the overall performance of the virtualized system. Among the scheduling policies of a VM scheduler, the setting of the time-slice length is particularly important. For I/O-intensive VMs, low I/O latency is the top-priority demand, in which case the time-slice should be short: a short time-slice guarantees VMs a higher scheduling frequency and thus ensures that their I/O requests are handled promptly. In contrast, non-I/O-intensive VMs prefer fewer context switches. A longer time-slice suits them better, because it avoids many expensive VM context switches and provides better CPU utilization. A unified time-slice therefore cannot satisfy the requirements of both I/O-intensive and non-I/O-intensive VMs. The credit scheduler, the default and most commonly used VM scheduler in Xen, adopts a unified time-slice for VMs of different I/O intensity, and thus impedes the adaptivity of the virtualized system.
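
To make this trade-off concrete, the following back-of-the-envelope Python sketch (our illustration; the per-switch cost of 0.03 ms and the count of eight runnable VMs are assumed values, not measurements from this paper) contrasts CPU efficiency with worst-case I/O wait for short and long time-slices. The 30 ms value corresponds to the credit scheduler's default time-slice.

    # Back-of-the-envelope model of the time-slice trade-off (assumed costs).
    def cpu_efficiency(tslice_ms, switch_cost_ms=0.03):
        """Fraction of CPU time spent on useful work vs. context switching."""
        return tslice_ms / (tslice_ms + switch_cost_ms)

    def worst_case_io_wait(tslice_ms, runnable_vms):
        """Worst-case wait (ms) before an I/O-intensive VM runs again,
        assuming round-robin over all runnable VMs on the same pCPU."""
        return (runnable_vms - 1) * tslice_ms

    for tslice in (1, 10, 30):  # short slices vs. the credit default of 30 ms
        print(f"tslice={tslice:2d} ms: "
              f"CPU efficiency={cpu_efficiency(tslice):.4f}, "
              f"worst-case I/O wait={worst_case_io_wait(tslice, 8):.0f} ms")

A short time-slice keeps the worst-case I/O wait small at a modest efficiency cost, while a long time-slice does the reverse; no single value serves both VM classes.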

Another deficiency of the credit scheduler is that it yields high I/O latency. The credit scheduler employs the Boost mechanism to accelerate the handling of I/O events. When an I/O event arrives for a non-Boost-priority VM that has remaining credit, the VM is raised to a higher scheduling priority, the Boost priority, and is scheduled ahead of VMs with lower priorities. In this manner, I/O event handling is made quicker. However, the Boost mechanism may not be effective in some cases. First, the Boost mechanism only boosts VMs with remaining credits. For a VM without remaining credits, its I/O events will not be handled until all VMs have run out of credits and every VM is supplied with newly allocated credits; this waiting time causes high I/O latency. Second, when there are so many I/O-intensive VMs in the scheduling queue that almost all VMs hold the Boost priority, a VM has to wait its turn in a long queue of Boost-priority VMs, which is time-consuming. In these two cases, the credit scheduler fails to mitigate I/O latency.
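
The following simplified Python model (our reconstruction for illustration; the actual logic resides in Xen's sched_credit.c) captures the boost-on-wakeup rule, with the two failure cases noted in comments.

    # Simplified model of the credit scheduler's Boost path (illustrative only).
    from dataclasses import dataclass

    BOOST, UNDER, OVER = 0, 1, 2  # smaller value = higher scheduling priority

    @dataclass
    class VM:
        name: str
        credit: int
        priority: int = UNDER

    def on_io_event(vm, runq):
        """Boost a woken VM only if it still has credit, then re-queue it
        ahead of all lower-priority VMs (stable within the same priority)."""
        if vm.credit > 0 and vm.priority == UNDER:
            vm.priority = BOOST
            if vm in runq:
                runq.remove(vm)
            idx = next((i for i, v in enumerate(runq) if v.priority > BOOST),
                       len(runq))
            runq.insert(idx, vm)
        # Failure case 1: vm.credit <= 0 -- no boost; the I/O event waits
        # for the next global credit refill. Failure case 2: if most VMs
        # are already BOOST, vm still queues behind a long run of
        # equal-priority VMs even after being boosted.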

Besides, the scheduling policies of the credit scheduler are not friendly to non-I/O-intensive VMs. The credit scheduler allows other VMs to preempt the pCPU from the VM currently occupying it. This policy reduces the latency of handling I/O events, but for non-I/O-intensive VMs it may cause unnecessary overhead. Non-I/O-intensive VMs are mainly CPU-bound or memory-bound and require less frequent context switching. Frequent VM context switches incur extra overhead for executing context-switching code and for cache swapping, which is expensive and degrades the performance of non-I/O-intensive VMs.

To improve the credit scheduler, the Xen organization adopted the cpupool scheme, which has been part of Xen since version 4.2 [8]. The idea behind the cpupool scheme is to group the physical cores (pCPUs) of the machine into several pools. At any time, a given pCPU can be assigned to at most one of these pools, and a VM is assigned to exactly one pool at a time. Each pool has an individual VM scheduler and can be configured with different scheduling parameters, enabling Xen to classify VMs into different types and allocate CPU resources according to the VMs' demands.
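
As a concrete illustration, the cpupool mechanism can be driven from dom0 through Xen's xl toolstack, e.g., via Python as sketched below; the pool name, pCPU number, and VM name are example values of ours, and the exact xl syntax may differ across Xen releases.

    # Sketch of setting up a dedicated CPU pool with xl (example values only).
    import subprocess

    def xl(*args):
        """Run an xl subcommand in dom0 and fail loudly on error."""
        subprocess.run(["xl", *args], check=True)

    # Create a pool with its own credit-scheduler instance, move pCPU 3
    # from the default Pool-0 into it, then migrate a VM into the new pool.
    xl("cpupool-create", 'name="io-pool"', 'sched="credit"')
    xl("cpupool-cpu-remove", "Pool-0", "3")
    xl("cpupool-cpu-add", "io-pool", "3")
    xl("cpupool-migrate", "vm-web", "io-pool")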

However, although the cpupool approach gives users the tools to implement VM classification, it does not address the rules governing which VMs should be grouped into the same pool or how the parameters of each pool-specific VM scheduler should be set. These issues have to be addressed in order to build a VM scheduler with high adaptability, low I/O latency, and fairness toward non-I/O-intensive VMs.

In this paper, we thus propose approaches to build a highly adaptive VM scheduler, called vSimilar, by extending our previous work [9]. Compared with that work, vSimilar improves the dynamic time-slice function and the VM scheduling scheme. vSimilar classifies VMs according to their average I/O-event count per second, using as many types as possible to obtain a fine-grained classification and thus a close match between the scheduling strategy and the characteristics of various VMs. Specifically, vSimilar caps the number of VM types at the number of pCPUs: with more types than pCPUs, VMs of different types would have to share a single pCPU, making it impossible to set different parameters for each pool-specific VM scheduler. On each physical core, vSimilar dynamically calculates a time-slice for all VMs on that core. vSimilar takes the performance of both I/O and CPU activities into consideration, treating the two in a more balanced and fairer way.
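
A minimal Python sketch of this classification rule (our reconstruction, not the authors' code) sorts VMs by their average I/O-event rate and splits them into as many equal-sized classes as there are pCPUs:

    # Group VMs with similar I/O rates; each group maps onto one CPU pool.
    def classify(vms, num_pcpus):
        """vms: list of (name, avg_io_events_per_sec) pairs.
        Returns num_pcpus contiguous buckets of I/O-rate-sorted VMs."""
        ranked = sorted(vms, key=lambda v: v[1])
        groups = [[] for _ in range(num_pcpus)]
        for i, vm in enumerate(ranked):
            # Contiguous buckets keep VMs with close I/O rates together.
            groups[i * num_pcpus // len(ranked)].append(vm)
        return groups

    pools = classify([("web", 900), ("db", 450), ("batch", 3), ("hpc", 1)], 2)
    # -> [[('hpc', 1), ('batch', 3)], [('db', 450), ('web', 900)]]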

The contributions of this paper can be summarized as follows:

  1. We analyze the main causes of high latency, low CPU usage, and low network throughput;

  2. We propose a new VM scheduling model, vSimilar, to achieve higher CPU and I/O performance;

  3. Our prototype system, vSimilar, implemented on the Xen 4.4 hypervisor, achieves nearly 95% latency mitigation and about 40% and 20% throughput gains on TCP and UDP, respectively, which is substantially higher than previously proposed approaches. It is worth noting that vSimilar is implemented at the application level and requires no modification of the Xen source code.

The rest of this paper is organized as follows. Section 2 introduces the background, our previous work, and the unresolved issues. Section 3 describes in detail how vSimilar addresses the main causes of the low performance of a VMM. Section 4 briefly discusses implementation issues of vSimilar. Section 5 presents the evaluation results of vSimilar on the Xen hypervisor. Finally, Section 6 discusses future work and concludes the paper.


Motivation

In this section, we first give background on VM scheduling, then describe the relationship between the CPU time-slice and system performance, and identify the main causes of the poor performance of current VM schedulers. After that, we outline our proposed methods to improve performance.

vSimilar

The previous section described the main weaknesses of current VM scheduling solutions. In this section, we introduce our proposed scheduling model, called vSimilar, and describe in detail how it eliminates these weaknesses. We first give an overview of the vSimilar architecture in Section 3.1. In the rest of the section, we quantitatively analyze the deficiencies of existing methods and explain how vSimilar tackles them (Sections 3.2 and 3.3).

Implementation

vSimilar is built on the CPU pool mechanism of the Xen VMM and enhances it with:

  1. a VM multi-classification model;

  2. variable time-slicing.

The scheduling procedure of vSimilar consists of four steps: (1) collecting statistics; (2) VM classification; (3) separating VMs of different types and attaching them to CPU pools; and (4) assigning time-slices. In every rescheduling cycle, steps (1)-(4) are applied in that order. Step (1) collects statistics about every VM for the purpose of VM classification.
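
The following Python sketch outlines one rescheduling cycle as we read these four steps; all helpers and the concrete time-slice formula are hypothetical placeholders standing in for the mechanisms of Sections 3.2 and 3.3.

    # One vSimilar rescheduling cycle (sketch; helpers are placeholders).
    import random

    class Pool:
        def __init__(self, name):
            self.name = name
        def set_timeslice(self, ms):
            print(f"{self.name}: time-slice set to {ms:.1f} ms")

    def sample_io_events_per_sec(vm):           # step (1): collect statistics
        return random.uniform(0, 1000)

    def classify_by_io_rate(stats, num_types):  # step (2): VM classification
        ranked = sorted(stats, key=stats.get)
        n = len(ranked)
        return [ranked[i * n // num_types:(i + 1) * n // num_types]
                for i in range(num_types)]

    def migrate_to_pool(vm, pool):              # step (3): attach VM to pool
        print(f"migrating {vm} to {pool.name}")

    def dynamic_timeslice(group_stats):         # step (4): shorter slices for
        if not group_stats:                     # I/O-heavier groups (assumed
            return 30.0                         # placeholder formula)
        mean_io = sum(group_stats) / len(group_stats)
        return max(1.0, 30.0 / (1.0 + mean_io / 100.0))

    def rescheduling_cycle(vms, pools):
        stats = {vm: sample_io_events_per_sec(vm) for vm in vms}
        groups = classify_by_io_rate(stats, num_types=len(pools))
        for pool, group in zip(pools, groups):
            for vm in group:
                migrate_to_pool(vm, pool)
            pool.set_timeslice(dynamic_timeslice([stats[vm] for vm in group]))

    rescheduling_cycle(["vm1", "vm2", "vm3", "vm4"],
                       [Pool("pool-low-io"), Pool("pool-high-io")])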

Evaluation

In this section, we present the results of an experimental evaluation of vSimilar using the Xen-based prototype. We use microbenchmarks as well as application-level benchmarks to evaluate the effectiveness of vSimilar. Our experiments evaluate three key aspects of the performance achieved by vSimilar: (a) transport-level latency reduction; (b) overall CPU-sharing fairness; and (c) application-level performance improvement.
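
For reference, client-side measurements of the kind reported in this paper (Ping RTT and TCP/UDP throughput) can be collected along the following lines; the server address, packet count, and test durations are our example values, not the actual experimental configuration.

    # Sketch of client-side latency/throughput measurements (example values).
    import subprocess

    SERVER = "192.168.0.10"  # hypothetical address of the Xen-hosted VM server

    subprocess.run(["ping", "-c", "100", SERVER], check=True)        # Ping RTT
    subprocess.run(["iperf", "-c", SERVER, "-t", "30"], check=True)  # TCP
    subprocess.run(["iperf", "-c", SERVER, "-u", "-b", "100M",       # UDP
                    "-t", "30"], check=True)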

Conclusion and future work

We proposed a new VM scheduler, vSimilar, to overcome the shortcomings of the existing schedulers used by current VMMs. The basic idea is to classify VMs with close interrupt rates into the same group, assign them the same time-slice, and place them on the same physical CPU core for execution. With vSimilar, I/O-intensive and non-I/O-intensive applications are separated to achieve better performance in terms of scheduling timeliness and CPU usage fairness. The evaluation of a Xen-based prototype confirms these improvements.

Acknowledgments

We would like to thank the reviewers for their insightful feedback. This work was supported in part by NSFC (nos. 61525204 and 61732010).

References (42)

  • Y. Cheng et al., Precise contention-aware performance prediction on virtualized multicore system, J. Syst. Archit. (2017)
  • R.S. Michalski et al., Machine Learning: An Artificial Intelligence Approach (2013)
  • A. Surie et al., Low-bandwidth VM migration via opportunistic replay, Proceedings of the 9th Workshop on Mobile Computing Systems and Applications (2008)
  • S. Wu et al., Doris: an adaptive soft real-time scheduler in virtualized environments, IEEE Trans. Serv. Comput. (2017)
  • N. Juma et al., The overhead from combating side-channels in cloud systems using VM-scheduling, IEEE Trans. Dependable Secure Comput. (2018)
  • B. Guan et al., CIVSched: a communication-aware inter-VM scheduling technique for decreased network latency between co-located VMs, IEEE Trans. Cloud Comput. (2014)
  • G.H. Bindu et al., A statistical survey on VM scheduling in cloud workstation for reducing energy consumption by balancing load in cloud, 2017 International Conference on Networks & Advances in Computational Technologies (NetACT) (2017)
  • K.-M. Cho et al., A hybrid meta-heuristic algorithm for VM scheduling with load balancing in cloud computing, Neural Comput. Appl. (2015)
  • A.K.S. Rajan et al., Hypervisor for consolidating real-time automotive control units: its procedure, implications and hidden pitfalls, J. Syst. Archit. (2018)
  • Introduction to cpupools, [accessed 2017], ...
  • X. Liu et al., ECPS: an application-specific VM scheduler basing on CPU pool mechanism for big data environment, 2018 20th International Conference on Advanced Communication Technology (ICACT) (2018)
  • P. Barham et al., Xen and the art of virtualization, ACM SIGOPS Operating Systems Review (2003)
  • L. Cherkasova et al., Comparison of the three CPU schedulers in Xen, SIGMETRICS Perform. Eval. Rev. (2007)
  • N. Nishiguchi, Evaluation and consideration of the credit scheduler for client virtualization, Xen Summit Asia (2008)
  • D. Ongaro et al., Scheduling I/O in virtual machine monitors, Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (2008)
  • L. Zeng et al., An improved Xen credit scheduler for I/O latency-sensitive applications on multicores, 2013 International Conference on Cloud Computing and Big Data (CloudCom-Asia) (2013)
  • E. Ackaouy, The Xen credit CPU scheduler (2006)
  • R. Ma et al., DASS: dynamic time slice scheduler for virtual machine monitor, International Conference on Algorithms and Architectures for Parallel Processing (2015)
  • C. Xu et al., vSlicer: latency-aware virtual machine scheduling via differentiated-frequency CPU slicing, Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing (2012)
  • C. Xu et al., vTurbo: accelerating virtual machine I/O processing using designated turbo-sliced core, 2013 USENIX Annual Technical Conference (USENIX ATC 13) (2013)
  • H. Guan et al., Workload-aware credit scheduler for improving network I/O performance in virtualization environment, IEEE Trans. Cloud Comput. (2014)
    Liwei Lin received the B.S. and M.S. degrees from Fujian Normal University, China, in 2006 and 2010, respectively. He is currently pursuing the Ph.D. degree in computer science and engineering with Shanghai Jiao Tong University, China. His research interests include data center networks, mobile computing, and cloud computing.

    Xiaodong Liu received the B.S. degree in Computer Science from Nanjing University of Aeronautics and Astronautics, China, in 2015. He has been pursuing the M.S. degree at Shanghai Jiao Tong University since 2015.

    Ruhui Ma received the Ph.D. degree in computer science from Shanghai Jiao Tong University (SJTU), China, in 2011. He held postdoctoral positions with SJTU (2012-2013) and McGill University, Canada (2014). He is currently an Assistant Professor with the Department of Computer Science and Engineering, SJTU. His main research interests are in virtual machines, computer architecture, and network virtualization.

    Jian Li received the BS degree in electronics and information technology from TianJin University, China, in 2001, the MS degree in telecommunication and computer science from the University of Henri Poincare, France, in 2003, and the PhD degree in computer science from the Institute National Polytechnique de Lorraine (INPL), Nancy, France, in 2007. He is an associate professor in the School of Software at Shanghai Jiao Tong University. He has worked as a postdoctoral researcher at the University of Toronto and as an associated researcher at McGill University in 2007 and 2008, respectively. His research interests include real-time scheduling theory, Cyber-Physical system, real-time communication, network protocol design and quality of service, real-time computing and embedded system. He is a member of the IEEE.

    Dajin Wang received the B.Eng. degree in computer engineering from the Shanghai University of Science and Technology, Shanghai, China, in 1982, and the Ph.D. degree in computer science from the Stevens Institute of Technology, Hoboken, NJ, USA, in 1990. Since then, he has been with the Department of Computer Science at Montclair State University, Montclair, NJ, USA, where he is currently a Professor. He received several university awards for his scholarly accomplishments. He has held visiting positions in other universities, and has consulted in industry. His main research interests include interconnection networks, fault-tolerant computing, algorithmic robotics, parallel processing, and wireless ad hoc and sensor networks. He has published more than 60 papers in these areas. He was an Associate Editor of the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS from 2010 to 2014.

    Haibing Guan received the PhD degree in computer science from the Tongji University, China, in 1999. He is currently a professor with the Faculty of Computer Science, Shanghai Jiao Tong University, China. He is a member of CCF. His current research interests include, but are not limited to, computer architecture, compiling, virtualization, and hardware/software co-design. He is a member of the IEEE.
