ABSTRACT
Deep learning (DL) has achieved great success in many real applications. Despite this success, deploying advanced DL models in database systems raises several problems, such as hyper-parameter tuning, the risk of overfitting, and the lack of prediction uncertainty. In this paper, we study lightweight, accurate, and uncertainty-aware cardinality estimation for SQL queries. By lightweight, we mean that the model can be trained in a few seconds. With calibrated uncertainty, it becomes possible to update the estimator to improve its predictions in regions of high uncertainty. The approach we explore differs from the prevailing direction of deploying ever more sophisticated DL models as cardinality estimators in database systems. We employ Bayesian deep learning (BDL), which bridges Bayesian inference and deep learning: the predictive distribution of a BDL model provides principled uncertainty calibration for its predictions. Moreover, as the network width of a BDL model goes to infinity, the model becomes equivalent to a Gaussian process (GP). This special class of BDL, known as the Neural Network Gaussian Process (NNGP), inherits the advantages of the Bayesian approach while retaining the universal approximation power of neural networks, and, as a nonparametric model, it can exploit a much larger model space to fit distribution-free data. In extensive performance studies against existing learned estimators, we show that our NNGP estimator achieves high accuracy, is built fast, and is robust to query workload shift. We also confirm the effectiveness of NNGP by integrating it into PostgreSQL.
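To make the NNGP idea concrete, below is a minimal, self-contained sketch (not the paper's implementation) of NNGP-based regression: the kernel of an infinitely wide fully connected ReLU network is computed by the arc-cosine recursion of Cho and Saul, and exact GP inference under that kernel yields a predictive mean (the cardinality estimate, e.g. in log space) together with a per-query variance (the uncertainty used to decide where the estimator needs updating). The function names, hyper-parameters (depth, weight/bias variances, observation noise), and the toy query featurization are all illustrative assumptions; optimized kernels of this kind are available in libraries such as neural_tangents.

```python
import numpy as np

def nngp_kernel(X1, X2, depth=3, sigma_w2=1.6, sigma_b2=0.1):
    """NNGP kernel of an infinitely wide fully connected ReLU network,
    via the arc-cosine recursion (hyper-parameters here are illustrative).
    X1: (n1, d), X2: (n2, d). Returns the (n1, n2) kernel matrix."""
    # Layer-0 (input) covariances.
    k12 = sigma_b2 + sigma_w2 * (X1 @ X2.T) / X1.shape[1]
    k11 = sigma_b2 + sigma_w2 * np.sum(X1 * X1, axis=1) / X1.shape[1]
    k22 = sigma_b2 + sigma_w2 * np.sum(X2 * X2, axis=1) / X2.shape[1]
    for _ in range(depth):
        norm = np.sqrt(np.outer(k11, k22))
        theta = np.arccos(np.clip(k12 / norm, -1.0, 1.0))
        # E[relu(u) relu(v)] for (u, v) ~ N(0, [[k11, k12], [k12, k22]]).
        k12 = sigma_b2 + (sigma_w2 / (2 * np.pi)) * norm * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
        k11 = sigma_b2 + (sigma_w2 / 2.0) * k11  # theta = 0 on the diagonal
        k22 = sigma_b2 + (sigma_w2 / 2.0) * k22
    return k12

def nngp_predict(X_train, y_train, X_test, noise=1e-3, **kw):
    """Exact GP posterior under the NNGP kernel: predictive mean plus a
    per-query variance that quantifies the estimator's uncertainty."""
    K = nngp_kernel(X_train, X_train, **kw) + noise * np.eye(len(X_train))
    Ks = nngp_kernel(X_train, X_test, **kw)           # (n_train, n_test)
    Kss = nngp_kernel(X_test, X_test, **kw)
    L = np.linalg.cholesky(K)                         # one factorization; no SGD epochs
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    var = np.diag(Kss) - np.sum(v * v, axis=0)
    return mean, var

# Toy usage: queries featurized as normalized predicate bounds, labels as
# log-cardinalities (both hypothetical stand-ins for a real workload).
rng = np.random.default_rng(0)
X_train, X_test = rng.uniform(size=(500, 8)), rng.uniform(size=(10, 8))
y_train = rng.normal(size=500)
mean, var = nngp_predict(X_train, y_train, X_test)
```

Note that "training" here reduces to a single Cholesky factorization of the train-train kernel matrix rather than iterated gradient descent, which is consistent with the abstract's claim that the estimator can be built in seconds for moderate training-set sizes.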