DOI: 10.1145/2592798.2592821
research-article

Reconciling high server utilization and sub-millisecond quality-of-service

Published: 14 April 2014

ABSTRACT

The simplest strategy to guarantee good quality of service (QoS) for a latency-sensitive workload with sub-millisecond latency in a shared cluster environment is to never run other workloads concurrently with it on the same server. Unfortunately, this inevitably leads to low server utilization, reducing both the capability and cost effectiveness of the cluster.

In this paper, we analyze the challenges of maintaining high QoS for low-latency workloads when sharing servers with other workloads. We show that workload co-location leads to QoS violations due to increases in queuing delay, scheduling delay, and thread load imbalance. We present techniques that address these vulnerabilities, ranging from provisioning the latency-critical service in an interference-aware manner to replacing the Linux CFS scheduler with a scheduler that provides good latency guarantees and fairness for co-located workloads. Ultimately, we demonstrate that some latency-critical workloads can be aggressively co-located with other workloads while still achieving good QoS, and that such co-location can improve a datacenter's effective throughput per TCO-$ by up to 52%.
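To illustrate why queuing delay is sensitive to co-location, the following is a minimal sketch using the textbook M/M/1 queue. It is an assumption chosen for illustration, not the model or the measurements used in the paper. The function name `mm1_mean_latency` and the request rates are hypothetical. The sketch treats interference from a co-located workload as a reduction in the service rate and shows how mean latency, 1 / (service rate - arrival rate), responds.

```python
def mm1_mean_latency(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system (queueing plus service), in seconds, for an M/M/1 queue.

    Rates are in requests per second. The M/M/1 model is an illustrative
    assumption, not the analysis from the paper.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative (arrival) and positive (service)")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    arrival_rate = 50_000.0        # hypothetical offered load, requests/s
    baseline_service = 100_000.0   # hypothetical service rate when running alone

    # Model interference as a fractional slowdown of the service rate.
    for remaining_fraction in (1.0, 0.8, 0.6):
        service_rate = baseline_service * remaining_fraction
        latency_us = mm1_mean_latency(arrival_rate, service_rate) * 1e6
        print(f"service rate {service_rate:>9.0f} req/s -> mean latency {latency_us:6.1f} us")
```

Under these assumed numbers, losing 40% of the service rate raises mean latency by a factor of five, which is why a modest slowdown from a co-located workload can translate into a disproportionate QoS violation.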

Published in

EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems
April 2014
388 pages
ISBN: 9781450327046
DOI: 10.1145/2592798

Copyright © 2014 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

EuroSys '14 paper acceptance rate: 27 of 147 submissions, 18%. Overall acceptance rate: 241 of 1,308 submissions, 18%.
