research-article

PSLO: enforcing the X^th percentile latency and throughput SLOs for consolidated VM storage

Authors:

Zhan Shi

EuroSys '16: Proceedings of the Eleventh European Conference on Computer Systems

Article No.: 28, Pages 1 - 14

https://doi.org/10.1145/2901318.2901330

Published: 18 April 2016

Abstract

It is desirable but challenging to simultaneously support a latency SLO at a pre-defined percentile, i.e., the X^th percentile latency SLO, and a throughput SLO for consolidated VM storage. Enforcing the X^th percentile latency helps to differentiate service levels accurately in terms of application-level latency SLO compliance, especially for applications built on multiple VMs. However, enforcing the X^th percentile latency SLO and enforcing the throughput SLO pull in opposite directions, because they place conflicting requirements on the level of IO concurrency. To address this challenge, this paper proposes PSLO, a framework that supports the X^th percentile latency and throughput SLOs in a consolidated VM environment by precisely coordinating the level of IO concurrency and the arrival rate for each VM issue queue. PSLO can also use whatever IO capacity remains available within the SLO constraints to improve throughput or reduce latency on a best-effort basis. We design and implement a PSLO prototype in a real VM consolidation environment built on Xen. Our extensive trace-driven prototype evaluation shows that the system is able to optimize the X^th percentile latency and throughput for consolidated VMs under SLO constraints.
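To make the two SLO types concrete, the following minimal sketch checks one VM's completed requests against an X^th percentile latency SLO and a throughput SLO. It is illustrative only and is not PSLO's algorithm: the function names, the nearest-rank percentile definition, and the measurement-window model are all assumptions, since the abstract does not specify them. The last helper applies Little's law (mean IOs in flight = arrival rate x mean latency), a standard queueing identity that gives one way to see why concurrency ties the two goals together: sustaining a higher arrival rate at a given latency requires more outstanding IOs.

```python
import math


def percentile_latency(latencies_ms, x):
    """Nearest-rank X-th percentile (assumed definition, not from the paper)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    ordered = sorted(latencies_ms)
    # 1-based rank of the smallest sample that covers x% of all samples.
    rank = math.ceil(x * len(ordered) / 100)
    return ordered[rank - 1]


def meets_slos(latencies_ms, window_s, x, latency_slo_ms, throughput_slo_iops):
    """Check one VM's completed requests in a window against both SLOs.

    All parameter names are hypothetical; latencies_ms holds the latency of
    every request completed during a window of window_s seconds.
    """
    p_x = percentile_latency(latencies_ms, x)
    iops = len(latencies_ms) / window_s
    return p_x <= latency_slo_ms and iops >= throughput_slo_iops


def mean_outstanding_ios(iops, mean_latency_ms):
    """Little's law: mean IOs in flight = arrival rate x mean latency."""
    return iops * mean_latency_ms / 1000.0


if __name__ == "__main__":
    samples = list(range(1, 21))  # 20 requests with latencies 1..20 ms
    print(percentile_latency(samples, 95))
    print(meets_slos(samples, 1.0, 95, 19, 20))
    print(mean_outstanding_ios(2000, 4.0))
```

In this sketch a VM is compliant only when both conditions hold in the same window. A controller in the spirit of the abstract would adjust the concurrency level and arrival rate of each VM issue queue until that is the case.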

Published In

EuroSys '16: Proceedings of the Eleventh European Conference on Computer Systems

April 2016

605 pages

ISBN: 9781450342407

DOI: 10.1145/2901318

General Chairs: Cristian Cadar (Imperial College London, UK), Peter Pietzuch (Imperial College London, UK)

Program Chairs: Kimberly Keeton (HP Labs), Rodrigo Rodrigues (Instituto Superior Técnico (Univ. Lisbon) and INESC-ID, Lisbon, Portugal)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

US NSF
NSFC
National High-tech R&D Program of China (863 Program)

Conference

EuroSys '16: Eleventh EuroSys Conference 2016

April 18 - 21, 2016

London, United Kingdom

Acceptance Rates

EuroSys '16 Paper Acceptance Rate 38 of 180 submissions, 21%;

Overall Acceptance Rate 241 of 1,308 submissions, 18%

Bibliometrics

Total Citations: 23
Total Downloads: 641
Downloads (Last 12 months): 17
Downloads (Last 6 weeks): 2

Reflects downloads up to 02 Mar 2025

Cited By

Sun J, Reidys B, Li D, Chang J, Snir M, Huang J, Eeckhout L, Smaragdakis G, Liang K, Sampson A, Kim M, Rossbach C (2025). FleetIO: Managing Multi-Tenant Cloud Storage with Multi-Agent Reinforcement Learning. Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, pp. 478-492. DOI: 10.1145/3669940.3707229. Online publication date: 30-Mar-2025.
https://dl.acm.org/doi/10.1145/3669940.3707229
Ma L, Liu Z, Xiong J, Wu Y, Chen R, Peng X, Zhang Y, Zhang G, Jiang D (2024). zQoS: Unleashing full performance capabilities of NVMe SSDs while enforcing SLOs in distributed storage systems. Proceedings of the 53rd International Conference on Parallel Processing, pp. 618-628. DOI: 10.1145/3673038.3673156. Online publication date: 12-Aug-2024.
https://dl.acm.org/doi/10.1145/3673038.3673156
Wen J, Ge J, Zhang Z, E Y, Wu B (2024). A Latency-Predictable Cloud-Native Network Architecture based on XDP. 2024 IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA), pp. 661-668. DOI: 10.1109/ISPA63168.2024.00090. Online publication date: 30-Oct-2024.
https://doi.org/10.1109/ISPA63168.2024.00090
Hu Y, Huang G, Huang P, Druschel P, Kaufmann A, Mace J, Flinn J, Seltzer M (2023). Pushing Performance Isolation Boundaries into Application with pBox. Proceedings of the 29th Symposium on Operating Systems Principles, pp. 247-263. DOI: 10.1145/3600006.3613159. Online publication date: 23-Oct-2023.
https://dl.acm.org/doi/10.1145/3600006.3613159
Wang Z, Li H, Sun L, Rosenkrantz T, Che H, Jiang H (2023). TailGuard: Tail Latency SLO Guaranteed Task Scheduling for Data-Intensive User-Facing Applications. 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS), pp. 898-909. DOI: 10.1109/ICDCS57875.2023.00042. Online publication date: Jul-2023.
https://doi.org/10.1109/ICDCS57875.2023.00042
Macedo R, Miranda M, Tanimura Y, Haga J, Ruhela A, Harrell S, Evans R, Pereira J, Paulo J (2023). Taming Metadata-intensive HPC Jobs Through Dynamic, Application-agnostic QoS Control. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 47-61. DOI: 10.1109/CCGrid57682.2023.00015. Online publication date: May-2023.
https://doi.org/10.1109/CCGrid57682.2023.00015
Ma L, Liu Z, Xiong J, Jiang D (2022). QWin: Core Allocation for Enforcing Differentiated Tail Latency SLOs at Shared Storage Backend. 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS), pp. 1098-1109. DOI: 10.1109/ICDCS54860.2022.00109. Online publication date: Jul-2022.
https://doi.org/10.1109/ICDCS54860.2022.00109
Fan P, Jin P, Luo Y, Wang X (2022). Burger-tree: A Three-Layer Cache-Conscious Tree Index for Persistent Memory. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 951-956. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00152. Online publication date: Dec-2022.
https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00152
Wang M, Stuardo C, Kurniawan D, Sinurat R, Gunawi H (2022). Layered Contention Mitigation for Cloud Storage. 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), pp. 167-178. DOI: 10.1109/CLOUD55607.2022.00036. Online publication date: Jul-2022.
https://doi.org/10.1109/CLOUD55607.2022.00036
Kwon D, Boo J, Kim D, Kim J, Lu S, Howell J (2020). FVM. Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, pp. 955-971. DOI: 10.5555/3488766.3488820. Online publication date: 4-Nov-2020.
https://dl.acm.org/doi/10.5555/3488766.3488820