LAF: A Local Depth Autoregressive Framework for Cardinality Estimation of Multi-attribute Queries

Cheng, Qianwen; Li, Hao; Wang, Dawei; Zhang, Yue; Peng, Zhaohui

doi:10.1007/978-981-97-2387-4_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14333)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Cardinality estimation is central to database query optimization because it directly affects query efficiency. Most existing methods model strongly and weakly correlated attributes in the same uniform way, and few make comprehensive use of both data information and query information. Some methods are inaccurate because their structure is too simple, while others are inefficient because their structure is too complex. Neither these methods nor simple combinations of them handle cardinality estimation well when strong and weak correlations coexist among attributes. We therefore propose LAF, a new Local deep Autoregressive Framework that models strongly and weakly correlated attributes at a fine granularity. LAF uses mutual information to identify strong and weak correlations between attributes. It applies a local strategy, building deep autoregressive models that learn the joint distribution of strongly correlated attributes and output the corresponding local estimations. A lightweight regression model then captures the complex mapping from the weakly correlated local estimations to the cardinality. LAF also uses information entropy to sort attributes in descending order. In this way the local deep autoregressive models learn from data information, while the lightweight regression model learns from query information. Extensive experimental evaluations on real datasets show that LAF achieves accurate results while significantly shortening estimation time and keeping model size within a reasonable range.
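The attribute-grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the mutual-information threshold is chosen, how groups are formed, or exactly how entropy enters the ordering. The sketch assumes discrete columns, a hypothetical `threshold` parameter, grouping by connected components of the "strongly correlated" relation, and sorting within each group by empirical entropy in descending order. All function names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(column):
    """Empirical Shannon entropy (in bits) of a discrete column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def mutual_information(x, y):
    """Empirical mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def group_attributes(table, threshold):
    """Group attributes whose pairwise MI exceeds `threshold`.

    Assumption: groups are the connected components of the graph that
    links two attributes when their MI is above the threshold.
    `table` maps attribute name -> list of discrete values.
    """
    names = list(table)
    parent = {a: a for a in names}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(names, 2):
        if mutual_information(table[a], table[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in names:
        groups.setdefault(find(a), []).append(a)

    # Assumption: order attributes inside each group by entropy, descending.
    return [
        sorted(g, key=lambda a: entropy(table[a]), reverse=True)
        for g in groups.values()
    ]


# Toy table: `city` and `zip` determine each other, `flag` is independent.
toy_table = {
    "city": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "zip": [1, 1, 2, 2, 3, 3, 4, 4],
    "flag": [0, 1, 0, 1, 0, 1, 0, 1],
}
toy_groups = group_attributes(toy_table, threshold=0.5)
```

In this toy table, `city` and `zip` end up in one strongly correlated group and `flag` in a group of its own. In a framework like the one described, each multi-attribute group would get its own local autoregressive model, and the per-group estimates would be passed to the regression model.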

Acknowledgement

This work is supported by the National Natural Science Foundation of China (No. 62072282), the Industrial Internet Innovation and Development Project in 2019 of China, and the Shandong Provincial Key Research and Development Program (No. 2019JZZY010105).

Author information

Authors and Affiliations

School of Computer Science and Technology, Shandong University, Qingdao, 266237, China
Qianwen Cheng, Hao Li, Dawei Wang, Yue Zhang & Zhaohui Peng

Corresponding author

Correspondence to Zhaohui Peng.

Editor information

Editors and Affiliations

Peng Cheng Laboratory, Shenzhen, China
Xiangyu Song
China University of Geosciences, Wuhan, China
Ruyi Feng
China University of Geosciences, Wuhan, China
Yunliang Chen
Deakin University, Burwood, VIC, Australia
Jianxin Li
University of Exeter, Exeter, UK
Geyong Min

About this paper

Cite this paper

Cheng, Q., Li, H., Wang, D., Zhang, Y., Peng, Z. (2024). LAF: A Local Depth Autoregressive Framework for Cardinality Estimation of Multi-attribute Queries. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_20

DOI: https://doi.org/10.1007/978-981-97-2387-4_20
Published: 28 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2386-7
Online ISBN: 978-981-97-2387-4
eBook Packages: Computer Science, Computer Science (R0)
