HUNTER: An Online Cloud Database Hybrid Tuning System for Personalized Requirements

Authors:

Jiashu Xing

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Pages 646 - 659

https://doi.org/10.1145/3514221.3517882

Published: 11 June 2022

Abstract

Recently, machine learning has shown great potential for performance tuning of cloud database (CDB) services. However, when facing personalized requirements, such as varied tuning restrictions combined with very different workloads, pre-trained models may mismatch a new workload or recommend suboptimal configurations for it. On the other hand, a system that tunes configurations online suffers from the cold-start problem, which results in long tuning times and performance fluctuation. To address these problems, we propose an online CDB tuning system called HUNTER. The key feature of HUNTER is a hybrid architecture that uses samples generated by a Genetic Algorithm to warm-start the finer-grained exploration of deep reinforcement learning. Meanwhile, we employ Principal Component Analysis, Random Forest, and a Fast Exploration Strategy to reduce the search space and the update time of the learning model. In addition, we propose a clone and parallelization scheme that stress-tests workloads on multiple cloned CDB instances (CDBs), resulting in faster and safer configuration exploration. Extensive trials on CDB with public and real-world workloads demonstrate that, given the same time budget and resources, HUNTER improves performance and considerably reduces recommendation time compared with state-of-the-art tuning systems, with speedups of up to 2.8× and 22.8× when using 1 and 20 cloned CDBs, respectively.
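
As a rough illustration of the warm-start idea described in the abstract, the following minimal Python sketch runs a toy genetic algorithm over a few knobs and records every evaluated (configuration, score) pair. A sample set of this kind could seed a reinforcement learner's replay buffer before finer-grained exploration begins. The knob names, their ranges, the synthetic scoring function, and all function names are assumptions made for this example and are not taken from HUNTER. The sketch also omits the DRL agent, PCA, Random Forest, and the parallel cloned instances.

```python
import random

# Hypothetical knobs and bounds (illustrative only, not from the paper).
KNOBS = {
    "buffer_pool_mb": (128, 8192),
    "io_threads": (1, 64),
    "log_file_mb": (16, 2048),
}
# Arbitrary optimum in normalized [0, 1] space for the synthetic objective.
TARGETS = (0.7, 0.4, 0.5)


def synthetic_score(cfg):
    """Stand-in for a stress test on a cloned instance; NOT a real benchmark."""
    score = 0.0
    for (name, (lo, hi)), target in zip(KNOBS.items(), TARGETS):
        x = (cfg[name] - lo) / (hi - lo)
        score -= (x - target) ** 2
    return score


def random_config(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in KNOBS.items()}


def crossover(a, b, rng):
    # Uniform crossover: each knob is inherited from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KNOBS}


def mutate(cfg, rng, rate=0.2):
    # Gaussian perturbation, clamped to the knob's legal range.
    out = dict(cfg)
    for k, (lo, hi) in KNOBS.items():
        if rng.random() < rate:
            out[k] = min(hi, max(lo, out[k] + rng.gauss(0, 0.1 * (hi - lo))))
    return out


def ga_warm_start(generations=10, pop_size=12, seed=0):
    """Run the toy GA and return all evaluated (config, score) samples."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    samples = []
    for _ in range(generations):
        scored = [(synthetic_score(c), c) for c in population]
        samples.extend((c, s) for s, c in scored)
        scored.sort(key=lambda t: t[0], reverse=True)
        elite = [c for _, c in scored[: pop_size // 2]]  # keep the best half
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    return samples


if __name__ == "__main__":
    replay_seed = ga_warm_start()
    best_cfg, best_score = max(replay_seed, key=lambda t: t[1])
    print(len(replay_seed), "samples; best score", round(best_score, 4))
```

With the default arguments, ga_warm_start() returns generations × pop_size samples. In a real system the score would come from stress-testing a database instance rather than from a formula, and the samples would be converted into whatever transition format the learner expects.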

Index Terms

HUNTER: An Online Cloud Database Hybrid Tuning System for Personalized Requirements
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems

Information

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

June 2022

2597 pages

ISBN:9781450392495

DOI:10.1145/3514221

General Chair: Zachary Ives, University of Pennsylvania (USA)
Program Chairs: Angela Bonifati, Lyon 1 University (France); Amr El Abbadi, University of California, Santa Barbara (USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

The National Natural Science Foundation
The Innovation Group Project of National Natural Science Foundation

Conference

SIGMOD/PODS '22

Sponsor:

SIGMOD

SIGMOD/PODS '22: International Conference on Management of Data

June 12 - 17, 2022

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total Citations: 28
Total Downloads: 1,065
Downloads (Last 12 months): 234
Downloads (Last 6 weeks): 30

Reflects downloads up to 27 Feb 2025

Cited By

Li C, Wang J, Shi J, Liu L, Zhang S. (2025). ADWTune: an adaptive dynamic workload tuning system with deep reinforcement learning. Complex & Intelligent Systems 11:4. Online publication date: 28-Feb-2025.
https://doi.org/10.1007/s40747-025-01801-3
Bianchi A, Chai A, Corvinelli V, Godfrey P, Szlichta J, Zuzarte C. (2024). Db2une: Tuning Under Pressure via Deep Learning. Proceedings of the VLDB Endowment 17:12 (3855-3868). Online publication date: 8-Nov-2024.
https://dl.acm.org/doi/10.14778/3685800.3685811
Lao J, Wang Y, Li Y, Wang J, Zhang Y, Cheng Z, Chen W, Tang M, Wang J. (2024). GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization. Proceedings of the VLDB Endowment 17:8 (1939-1952). Online publication date: 31-May-2024.
https://dl.acm.org/doi/10.14778/3659437.3659449
Wang Y, Chen P, Dou H, Zhang Y, Yu G, He Z, Huang H, Filkov V, Ray B, Zhou M. (2024). FaaSConf: QoS-aware Hybrid Resources Configuration for Serverless Workflows. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (957-969). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695477
Mai G, He Z, Yu G, Chen Z, Chen P. (2024). CTuner: Automatic NoSQL Database Tuning with Causal Reinforcement Learning. Proceedings of the 15th Asia-Pacific Symposium on Internetware (269-278). Online publication date: 24-Jul-2024.
https://dl.acm.org/doi/10.1145/3671016.3674809
Zhan Y, Xi R, Liao J, Fan S, Hou M. (2024). KnobTune: A Dynamic Database Configuration Tuning Strategy Leveraging Historical Workload Similarities. Proceedings of the International Conference on Computing, Machine Learning and Data Science (1-8). Online publication date: 12-Apr-2024.
https://dl.acm.org/doi/10.1145/3661725.3661734
Chen H, Chen X, Liang Z, Feng X, Xie J, Su H, Zheng K, Serra E, Spezzano F. (2024). Towards Online and Safe Configuration Tuning with Semi-supervised Anomaly Detection. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (218-227). Online publication date: 21-Oct-2024.
https://dl.acm.org/doi/10.1145/3627673.3679700
Dou H, Wang Y, Zhang Y, Chen P, Zheng Z. (2024). DeepCAT⁺: A Low-Cost and Transferrable Online Configuration Auto-Tuning Approach for Big Data Frameworks. IEEE Transactions on Parallel and Distributed Systems 35:11 (2114-2131). Online publication date: 1-Nov-2024.
https://dl.acm.org/doi/10.1109/TPDS.2024.3459889
Wang P, Jiang H, Liu Y, Zhao Z, Zhou K, Huang Z. (2024). Beyond Belady to Attain a Seemingly Unattainable Byte Miss Ratio for Content Delivery Networks. IEEE Transactions on Parallel and Distributed Systems 35:11 (1949-1963). Online publication date: 30-Aug-2024.
https://dl.acm.org/doi/10.1109/TPDS.2024.3452096
Dong H, Zhang C, Li G, Zhang H. (2024). Cloud-Native Databases: A Survey. IEEE Transactions on Knowledge and Data Engineering 36:12 (7772-7791). Online publication date: Dec-2024.
https://doi.org/10.1109/TKDE.2024.3397508
