Abstract
Cardinality estimation is central to database query optimization and directly affects query efficiency. Most existing methods model strongly and weakly correlated attributes in a uniform way and seldom make comprehensive use of both data information and query information. Some methods have poor accuracy because their structure is too simple, while others suffer from low efficiency because their structure is too complex. Neither these methods nor simple combinations of them handle cardinality estimation well when strong and weak associations coexist among attributes. We therefore propose LAF, a new Local deep Autoregressive Framework, which performs fine-grained modeling of attributes with strong and weak correlation. LAF uses mutual information to identify strong and weak associations between attributes. It applies a local strategy to construct deep autoregressive models that learn the joint distribution of strongly correlated attributes and output the corresponding local estimations, and it uses a lightweight regression model to capture the complex mapping between weakly correlated local estimations and the cardinality. LAF also uses information entropy to sort attributes in descending order. In this way, the local deep autoregressive models learn from data information, while the lightweight regression model learns from query information. Extensive experimental evaluations on real datasets show that LAF achieves accurate results while significantly shortening estimation time and keeping model size within a reasonable range.
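As a rough illustration of the grouping step described in the abstract, the following sketch computes empirical mutual information between attribute pairs, groups attributes whose mutual information exceeds a threshold, and orders each group by entropy in descending order. It is only a sketch under assumptions: the abstract does not specify the grouping rule, so the threshold value, the connected-components grouping, the function names, and the toy table are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(column):
    """Empirical Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def mutual_information(x, y):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def group_attributes(table, threshold):
    """Group attributes linked by pairwise MI above `threshold`.

    Assumption: "strongly correlated" means MI > threshold, and groups are
    the connected components of that relation (a hypothetical rule).
    """
    names = list(table)
    parent = {a: a for a in names}  # union-find forest over attribute names

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in combinations(names, 2):
        if mutual_information(table[a], table[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in names:
        groups.setdefault(find(a), []).append(a)
    # Within each group, order attributes by entropy, highest first.
    return [sorted(g, key=lambda a: entropy(table[a]), reverse=True)
            for g in groups.values()]


# Toy table (hypothetical data): `region` is determined by `city`,
# while `flag` is independent of both.
table = {
    "city":   ["A", "A", "B", "B", "C", "C", "D", "D"],
    "region": ["N", "N", "N", "N", "S", "S", "S", "S"],
    "flag":   ["x", "y", "x", "y", "x", "y", "x", "y"],
}
groups = group_attributes(table, threshold=0.5)
print(groups)
```

In this toy table, `city` and `region` end up in one group, which a local autoregressive model could then cover, while `flag` forms its own group. How the local estimations are fed to the regression model is not shown here.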
Acknowledgement
This work is supported by the National Natural Science Foundation of China (No. 62072282), the Industrial Internet Innovation and Development Project in 2019 of China, and the Shandong Provincial Key Research and Development Program (No. 2019JZZY010105).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cheng, Q., Li, H., Wang, D., Zhang, Y., Peng, Z. (2024). LAF: A Local Depth Autoregressive Framework for Cardinality Estimation of Multi-attribute Queries. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_20
DOI: https://doi.org/10.1007/978-981-97-2387-4_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2386-7
Online ISBN: 978-981-97-2387-4
eBook Packages: Computer Science, Computer Science (R0)