Abstract
Storage structure decision for a database aims to automatically determine an effective storage structure according to the data distribution and workload. As machine learning and databases become more closely integrated, complex machine learning tasks are executed directly in the database and need the support of an efficient storage structure. Existing storage decision methods are mainly oriented to common workloads and rely on the judgment of experienced DBAs, which is inefficient and error-prone. An automated storage structure decision method for in-database machine learning is therefore urgently needed. We propose a cost-based, lightweight system that automatically decides between row and column storage. To the best of our knowledge, this is the first storage structure selection method for machine learning tasks. Extensive experiments show that the system's storage structure decisions are more than 90% accurate, shorten task execution time by about 85%, and greatly reduce the risk of decision errors.
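To make the idea of a cost-based choice between row and column storage concrete, here is a minimal sketch. It assumes a deliberately simple I/O cost model (bytes scanned per query) that is not taken from the paper; the names `Query`, `row_cost`, `column_cost`, and `choose_layout`, as well as the `bytes_per_value` and `stitch_overhead` parameters, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    columns_read: int   # number of columns the query touches
    rows_read: int      # number of rows the query touches
    frequency: float    # relative weight of the query in the workload


def row_cost(q, total_columns, bytes_per_value=8):
    # Assumed row-store behavior: reading a row fetches all of its columns.
    return q.frequency * q.rows_read * total_columns * bytes_per_value


def column_cost(q, total_columns, bytes_per_value=8, stitch_overhead=1.2):
    # Assumed column-store behavior: only the touched columns are read,
    # scaled by a hypothetical constant for reassembling tuples.
    # total_columns is unused here; it is kept so both cost functions
    # share the same signature.
    return (q.frequency * q.rows_read * q.columns_read
            * bytes_per_value * stitch_overhead)


def choose_layout(workload, total_columns):
    # Sum the estimated cost of the whole workload under each layout
    # and pick the cheaper one.
    row_total = sum(row_cost(q, total_columns) for q in workload)
    col_total = sum(column_cost(q, total_columns) for q in workload)
    return "row" if row_total <= col_total else "column"


# A training-style scan over 2 of 50 columns versus frequent full-row lookups.
feature_scan = [Query(columns_read=2, rows_read=1_000_000, frequency=1.0)]
point_lookups = [Query(columns_read=50, rows_read=1, frequency=1000.0)]
print(choose_layout(feature_scan, total_columns=50))
print(choose_layout(point_lookups, total_columns=50))
```

Under these assumed costs, a scan that touches few columns favors the column layout, while full-row lookups favor the row layout. A real system would calibrate such costs against the actual storage engine.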
Notes
1. a: total key field size; b: total non-key field size; c: the number of fixed-length fields; d: the number of variable-length fields.
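The four quantities a, b, c, and d in this note can be derived from a table schema. The sketch below is illustrative only: the `Field` type and the `schema_features` function are hypothetical, and it assumes that a variable-length field is measured by its declared maximum size, which the note does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    size_bytes: int               # declared (or maximum) size; an assumption
    is_key: bool = False
    variable_length: bool = False


def schema_features(fields):
    """Compute the a, b, c, d quantities for a list of Field objects."""
    a = sum(f.size_bytes for f in fields if f.is_key)       # total key field size
    b = sum(f.size_bytes for f in fields if not f.is_key)   # total non-key field size
    c = sum(1 for f in fields if not f.variable_length)     # fixed-length field count
    d = sum(1 for f in fields if f.variable_length)         # variable-length field count
    return {"a": a, "b": b, "c": c, "d": d}


# Hypothetical three-column table used only for illustration.
schema = [
    Field("id", 8, is_key=True),
    Field("age", 4),
    Field("name", 64, variable_length=True),
]
print(schema_features(schema))
```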
Acknowledgements
This paper was supported by NSFC grants (U1866602, 71773025) and the National Key Research and Development Program of China (2020YFB1006104).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Cui, S., Wang, H., Gu, H., Xie, Y. (2021). Cost-Based Lightweight Storage Automatic Decision for In-Database Machine Learning. In: Zhang, W., Zou, L., Maamar, Z., Chen, L. (eds) Web Information Systems Engineering – WISE 2021. WISE 2021. Lecture Notes in Computer Science, vol 13080. Springer, Cham. https://doi.org/10.1007/978-3-030-90888-1_10
DOI: https://doi.org/10.1007/978-3-030-90888-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90887-4
Online ISBN: 978-3-030-90888-1
eBook Packages: Computer Science, Computer Science (R0)