ABSTRACT
Service providers want to reduce datacenter costs by consolidating workloads onto fewer servers. At the same time, customers have performance goals, such as meeting tail latency Service Level Objectives (SLOs). Consolidating workloads while meeting tail latency goals is challenging, especially since workloads in production environments are often bursty. To limit congestion when consolidating workloads, customers and service providers often agree upon rate limits. Ideally, rate limits are chosen to maximize the number of workloads that can be co-located while meeting each workload's SLO. In reality, neither the service provider nor the customer knows how to choose rate limits. Customers end up selecting rate limits on their own in an ad hoc fashion, and service providers are left to optimize given the chosen rate limits.
This paper describes WorkloadCompactor, a new system that uses workload traces to automatically choose rate limits while simultaneously selecting the server on which to place each workload. Our system meets customer tail latency SLOs while minimizing datacenter resource costs. Our experiments show that by optimizing the choice of rate limits, WorkloadCompactor reduces the number of required servers by 30 to 60% compared to state-of-the-art approaches.
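To make the idea of deriving a rate limit from a workload trace concrete, the following is a minimal sketch and not WorkloadCompactor's actual algorithm. The abstract does not say what form the rate limits take, so the sketch assumes a token-bucket limit described by a (rate, burst) pair. The function names `min_burst_for_rate` and `rate_burst_curve` and the trace format (time-sorted `(timestamp, size)` pairs) are hypothetical. For a candidate rate, it computes the smallest burst size under which the trace would never be throttled, which exposes the trade-off between rate and burst that a chooser of rate limits could search over.

```python
from typing import List, Sequence, Tuple

# Assumed trace format (hypothetical): (arrival time in seconds, request size),
# sorted by arrival time.
Trace = Sequence[Tuple[float, float]]


def min_burst_for_rate(trace: Trace, rate: float) -> float:
    """Smallest token-bucket burst size that admits every request in `trace`
    without delay, given a refill rate of `rate` units per second.

    This equals the peak backlog of a virtual queue drained at `rate`.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    backlog = 0.0
    burst = 0.0
    prev_time = None
    for timestamp, size in trace:
        if prev_time is not None:
            # Drain the virtual queue for the time elapsed since the last arrival.
            backlog = max(0.0, backlog - rate * (timestamp - prev_time))
        backlog += size
        burst = max(burst, backlog)
        prev_time = timestamp
    return burst


def rate_burst_curve(trace: Trace, rates: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair each candidate rate with the minimal burst the trace needs at that rate."""
    return [(r, min_burst_for_rate(trace, r)) for r in rates]


if __name__ == "__main__":
    example_trace = [(0.0, 4.0), (1.0, 4.0), (2.0, 1.0), (10.0, 2.0)]
    for r, b in rate_burst_curve(example_trace, [1.0, 2.0, 4.0]):
        print(f"rate={r} burst={b}")
```

A lower rate requires a larger burst allowance and vice versa, so a bursty workload admits many feasible (rate, burst) pairs. Under the assumptions above, which pair is best depends on what else shares the server, which is why choosing limits jointly with placement can help.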
Index Terms
WorkloadCompactor: reducing datacenter cost while providing tail latency SLO guarantees