ABSTRACT
The simplest strategy to guarantee good quality of service (QoS) for a latency-sensitive workload with sub-millisecond response-time requirements in a shared cluster environment is to never run other workloads concurrently with it on the same server. Unfortunately, this inevitably leads to low server utilization, reducing both the capability and cost effectiveness of the cluster.
In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads. We show that workload co-location leads to QoS violations due to increases in queuing delay, scheduling delay, and thread load imbalance. We present techniques that address these vulnerabilities, ranging from provisioning the latency-critical service in an interference-aware manner, to replacing the Linux CFS scheduler with a scheduler that provides good latency guarantees and fairness for co-located workloads. Ultimately, we demonstrate that some latency-critical workloads can be aggressively co-located with other workloads while still achieving good QoS, and that such co-location can improve a datacenter's effective throughput per TCO-$ by up to 52%.
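The connection between server utilization and queuing delay can be illustrated with a textbook M/M/1 model (an illustrative assumption on our part; the paper's analysis is more detailed). In an M/M/1 queue, total time in system is exponentially distributed with rate μ − λ, so tail latency grows sharply as co-located work pushes utilization toward saturation. The service rate below is a hypothetical value chosen to give a sub-millisecond mean service time:

```python
import math

def mm1_latency_percentile(service_rate, arrival_rate, p):
    """Return the p-th percentile of total time in system (queueing +
    service) for an M/M/1 queue. Sojourn time in M/M/1 is exponential
    with rate (mu - lambda), so the percentile is -ln(1-p)/(mu - lambda)."""
    assert 0 < arrival_rate < service_rate, "queue must be stable"
    return -math.log(1.0 - p) / (service_rate - arrival_rate)

MU = 10_000.0  # assumed service rate: 10k req/s, i.e. 100 us mean service time

# Tail latency as utilization rises, e.g. due to co-located load.
for util in (0.3, 0.7, 0.9):
    lam = util * MU
    p99_ms = mm1_latency_percentile(MU, lam, 0.99) * 1e3
    print(f"utilization {util:.0%}: p99 latency {p99_ms:.2f} ms")
```

Under this simple model, raising utilization from 30% to 90% multiplies the 99th-percentile latency by a factor of (μ − 0.3μ)/(μ − 0.9μ) = 7, which is why even modest co-location can violate a sub-millisecond QoS target unless interference is managed.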
- Reconciling high server utilization and sub-millisecond quality-of-service