research-article

L-QoCo: learning to optimize cache capacity overloading in storage systems

Authors:

Yifan Li

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

Pages 379 - 384

https://doi.org/10.1145/3489517.3530466

Published: 23 August 2022

Abstract

Caches play an important role in maintaining high and stable performance (i.e., high throughput, low tail latency, and low throughput jitter) in storage systems. Existing rule-based cache management methods, coupled with engineers' manual configurations, cannot meet the ever-growing requirements of both time-varying workloads and complex storage systems, leading to frequent cache overloading.

In this paper, we propose the first lightweight learning-based cache bandwidth control technique, called L-QoCo, which adaptively controls the cache bandwidth to effectively prevent cache overloading in storage systems. Extensive experiments with various workloads on real systems show that L-QoCo, with its strong adaptability and fast learning ability, adapts to various workloads and controls cache bandwidth effectively, thereby significantly improving storage performance (e.g., increasing throughput by 10%-20% and reducing throughput jitter and tail latency by 2X-6X and 1.5X-4X, respectively, compared with two representative rule-based methods).
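
To make the idea of learning-based cache bandwidth control concrete, the following is a minimal illustrative sketch, not the L-QoCo algorithm itself, which the abstract does not specify. It assumes a tabular Q-learning agent that raises, holds, or lowers an admission bandwidth limit based on discretized cache occupancy. The toy cache model, the reward shape, and every name and constant (`occupancy_level`, `step`, `train`, `LEVELS`, the capacity and drain rates) are hypothetical choices made for illustration only.

```python
import random

ACTIONS = (-1, 0, 1)  # lower, hold, or raise the bandwidth limit by one step
LEVELS = 5            # number of discretized cache-occupancy states (assumed)


def occupancy_level(used, capacity):
    """Map cache occupancy to a discrete state in [0, LEVELS - 1]."""
    return min(LEVELS - 1, int(LEVELS * used / capacity))


def step(used, limit, demand, drain, capacity):
    """Toy cache model: admit up to `limit`, flush `drain`, detect overload."""
    admitted = min(demand, limit)
    used = max(0.0, used + admitted - drain)
    overloaded = used > capacity
    used = min(used, capacity)
    # Reward admitted throughput, but penalize overload heavily (assumed shape).
    reward = admitted - (10.0 * capacity if overloaded else 0.0)
    return used, reward


def train(episodes=200, steps=100, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train a tabular Q-learning controller on the toy cache model."""
    rng = random.Random(seed)
    capacity, drain, step_size = 100.0, 10.0, 2.0  # hypothetical units
    q = [[0.0] * len(ACTIONS) for _ in range(LEVELS)]
    for _ in range(episodes):
        used, limit = 0.0, 10.0
        state = occupancy_level(used, capacity)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            # Apply the action and keep the limit within assumed bounds.
            limit = min(40.0, max(2.0, limit + ACTIONS[a] * step_size))
            demand = rng.uniform(5.0, 30.0)  # time-varying workload
            used, reward = step(used, limit, demand, drain, capacity)
            nxt = occupancy_level(used, capacity)
            # Standard Q-learning temporal-difference update.
            q[state][a] += alpha * (reward + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q
```

Calling `train()` returns a `LEVELS` by `len(ACTIONS)` table of action values; taking the highest-valued action in each occupancy state gives a simple throttling policy. A tabular agent is used here only because it is cheap to update online, which matches the "lightweight" framing in the abstract; a real controller would need a state that also captures workload and device behavior.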

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

July 2022

1462 pages

ISBN: 9781450391429

DOI: 10.1145/3489517

General Chair:
Rob Oshana
NXP

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DAC '22: 59th ACM/IEEE Design Automation Conference

July 10 - 14, 2022

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
