ABSTRACT
Virtualization is omnipresent in server environments. Scheduling virtual machines is challenging because the virtual CPUs of a machine must make similar progress; otherwise, performance can degrade severely. Coscheduling is a commonly used technique to solve this issue. With coscheduled virtual machines, the host executes all virtual CPUs of a virtual machine at the same time. However, when virtual machines have arbitrary sizes, coscheduling a whole virtual machine can lead to under-utilization of the host. This situation occurs when the sizes of the virtual machines rule out any schedule in which all cores of the host machine are used at every point in time.
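The under-utilization described above can be illustrated with a small calculation. The following is a minimal sketch, not taken from the paper: the function name `best_full_coscheduling_utilization` and the example sizes (a 4-core host, two virtual machines with 3 virtual CPUs each) are assumptions chosen for illustration. It searches for the set of whole virtual machines that fills the host best in a single time slice.

```python
from itertools import combinations


def best_full_coscheduling_utilization(host_cores, vm_sizes):
    """Best fraction of host cores that can be busy in one time slice
    when every running VM must have all of its vCPUs running at once.

    Hypothetical helper for illustration; brute force over VM subsets.
    """
    best = 0
    for r in range(1, len(vm_sizes) + 1):
        for subset in combinations(vm_sizes, r):
            used = sum(subset)
            # A subset is only schedulable if all its vCPUs fit on the host.
            if used <= host_cores:
                best = max(best, used)
    return best / host_cores


# Two 3-vCPU VMs on a 4-core host: only one VM fits per slice,
# so one core stays idle in every slice.
print(best_full_coscheduling_utilization(4, [3, 3]))

# Two 2-vCPU VMs on the same host fill it completely.
print(best_full_coscheduling_utilization(4, [2, 2]))
```

In the first case no time slice can use more than three of the four cores, which is the kind of waste that smaller coscheduled sets are meant to reduce.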
In this paper, we show that this under-utilization can be reduced through partial coscheduling. Partial coscheduling uses sets that are based not on the size of the virtual machine but on the requirements of the load inside the virtual machine. We show through experiments with the Linux Kernel Virtual Machine (KVM) in combination with a coscheduling-capable Linux kernel that partial coscheduling can lead to an overall performance improvement compared to full coscheduling of complete virtual machines. The partial coscheduling approach requires knowledge about the relation between processes and threads inside the virtual machine, which is usually not available at runtime. To gather this information without modifying the guest, we propose an automatic algorithm based on the recent technique of Communication Detection through Shared Pages (SPCD), which detects the memory access behavior of applications inside virtual machines.
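To make the idea of load-based coscheduling sets concrete, the sketch below groups virtual CPUs by how much they communicate. It is an assumption-laden illustration and not the algorithm proposed in the paper: the communication matrix `comm`, the `threshold` parameter, and the function `coscheduling_sets` are hypothetical, and the grouping uses simple thresholding with union-find rather than whatever partitioning method the authors apply to the detected access patterns.

```python
def coscheduling_sets(comm, threshold):
    """Group vCPU indices into sets that should be coscheduled.

    comm[i][j] is an assumed symmetric count of detected communication
    between vCPU i and vCPU j. Two vCPUs land in the same set if they
    are connected by a chain of pairs with comm >= threshold.
    """
    n = len(comm)
    parent = list(range(n))

    def find(x):
        # Find the set representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if comm[i][j] >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 4-vCPU guest: vCPUs 0/1 and 2/3 communicate heavily,
# all other pairs barely at all.
comm = [
    [0, 50, 1, 1],
    [50, 0, 1, 1],
    [1, 1, 0, 40],
    [1, 1, 40, 0],
]
print(coscheduling_sets(comm, threshold=10))
```

With these made-up numbers the guest splits into two sets of two virtual CPUs each, so the host could coschedule each pair independently instead of reserving four cores at once.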
Our experiments show that partial coscheduling can improve the utilization of the host by reducing the waste of computation time caused by unnecessarily idle cores, thereby increasing the performance of virtual machines. In many scenarios, automated partial coscheduling can also increase the host utilization.
Index Terms
- Partial coscheduling of virtual machines based on memory access patterns