Cost-Based Lightweight Storage Automatic Decision for In-Database Machine Learning

  • Conference paper
Web Information Systems Engineering – WISE 2021 (WISE 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13080)

Abstract

Storage structure decision for a database aims to automatically determine an effective storage structure according to the data distribution and workload. As machine learning and databases become more tightly integrated, complex machine learning tasks are executed directly in the database and need the support of an efficient storage structure. Existing storage decision methods are mainly oriented to common workloads and rely on the judgment of experienced DBAs, which is inefficient and error-prone. Thus, an automated storage structure decision method for in-database machine learning is urgently needed. We propose a cost-based lightweight row-column storage automatic decision system. To the best of our knowledge, this is the first storage structure selection method for machine learning tasks. Extensive experiments show that the accuracy of the storage structure decision is above 90%, that the task execution time is shortened by about 85%, and that the risk of decision error is greatly reduced.
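To make the idea of a cost-based row-column decision concrete, the following is a minimal sketch of how such a choice could be framed. It is not the cost model from the paper, which this page does not reproduce. The names `TableStats`, `Query`, `row_cost`, `column_cost`, and `choose_layout`, the byte-count cost formulas, and the `STITCH_BYTES` overhead constant are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-column overhead (in bytes per row) for reassembling
# tuples from separate column files. The value is an arbitrary assumption.
STITCH_BYTES = 4


@dataclass
class TableStats:
    num_rows: int
    column_bytes: dict  # column name -> average width in bytes


@dataclass
class Query:
    columns: list       # columns the task reads
    frequency: int = 1  # how often the task runs in the workload


def row_cost(stats, query):
    # Assumed model: a row store scan reads every byte of every row,
    # no matter how few columns the task needs.
    row_width = sum(stats.column_bytes.values())
    return query.frequency * stats.num_rows * row_width


def column_cost(stats, query):
    # Assumed model: a column store reads only the accessed columns,
    # plus a fixed stitching overhead per accessed column per row.
    accessed = sum(stats.column_bytes[c] for c in query.columns)
    overhead = STITCH_BYTES * len(query.columns)
    return query.frequency * stats.num_rows * (accessed + overhead)


def choose_layout(stats, workload):
    # Pick the layout with the lower total estimated cost over the workload.
    total_row = sum(row_cost(stats, q) for q in workload)
    total_col = sum(column_cost(stats, q) for q in workload)
    return "column" if total_col < total_row else "row"


stats = TableStats(num_rows=1000, column_bytes={f"f{i}": 8 for i in range(10)})
narrow = [Query(columns=["f0", "f1"])]               # reads 2 of 10 columns
wide = [Query(columns=list(stats.column_bytes))]     # reads all 10 columns
print(choose_layout(stats, narrow), choose_layout(stats, wide))
```

Under these assumed formulas, a task that reads a few features favors the column layout, while a task that reads every field favors the row layout because the stitching overhead outweighs the savings. A real system would calibrate the costs against measured I/O and CPU behavior rather than raw byte counts.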

Notes

  1. a - total key field size; b - total non-key field size; c - the number of fixed-length fields; d - the number of variable-length fields.

Acknowledgements

This paper was supported by NSFC grants (U1866602, 71773025) and the National Key Research and Development Program of China (2020YFB1006104).

Author information

Corresponding author

Correspondence to Hongzhi Wang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Cui, S., Wang, H., Gu, H., Xie, Y. (2021). Cost-Based Lightweight Storage Automatic Decision for In-Database Machine Learning. In: Zhang, W., Zou, L., Maamar, Z., Chen, L. (eds) Web Information Systems Engineering – WISE 2021. WISE 2021. Lecture Notes in Computer Science, vol 13080. Springer, Cham. https://doi.org/10.1007/978-3-030-90888-1_10

  • DOI: https://doi.org/10.1007/978-3-030-90888-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-90887-4

  • Online ISBN: 978-3-030-90888-1

  • eBook Packages: Computer Science, Computer Science (R0)
