DOI: 10.1145/3534678.3539303
research-article

Efficient Join Order Selection Learning with Graph-based Representation

Published: 14 August 2022

Abstract

Join order selection plays an important role in DBMS query optimizers. The problem is to find the join order with the minimum cost, and it is generally NP-hard because the search space grows exponentially. Recent studies use deep reinforcement learning (DRL) to generate better join plans than those produced by conventional query optimizers. However, DRL-based methods require time-consuming training, which makes them unsuitable for online applications that need frequent periodic re-training. In this paper, we propose a novel framework, efficient Join Order selection learninG with Graph-basEd Representation (JOGGER). We first construct a schema graph based on the primary-foreign key relationships, from which table representations are learned to capture the correlations between tables. The second component is the state representation, in which a graph convolutional network encodes the query graph and a tailored tree-based attention module encodes the join plan. To speed up the convergence of the DRL training process, we exploit curriculum learning, in which queries are incrementally added to the training set according to their level of difficulty. Extensive experiments on the JOB and TPC-H datasets demonstrate the effectiveness and efficiency of the proposed solutions.
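The curriculum idea in the abstract can be illustrated with a small sketch. This is not JOGGER's implementation: the difficulty measure (number of joined tables), the equal-sized staging rule, and the names `left_deep_plans` and `curriculum_stages` are assumptions made for illustration. The first helper only shows why the number of tables is a plausible difficulty proxy, since the number of left-deep join orders alone is n!.

```python
from math import ceil, factorial


def left_deep_plans(n_tables: int) -> int:
    """Number of left-deep join orders for n tables (n!)."""
    return factorial(n_tables)


def curriculum_stages(queries: dict[str, int], n_stages: int):
    """Yield growing training sets, easiest queries first.

    `queries` maps a query name to its number of joined tables, which is
    used here as an ASSUMED difficulty proxy (hypothetical, not from the paper).
    """
    # Sort by difficulty; break ties by name so the order is deterministic.
    ordered = sorted(queries, key=lambda q: (queries[q], q))
    for stage in range(1, n_stages + 1):
        # Each stage keeps all earlier queries and adds the next-harder slice.
        cutoff = ceil(len(ordered) * stage / n_stages)
        yield ordered[:cutoff]


if __name__ == "__main__":
    workload = {"q1": 2, "q2": 5, "q3": 3, "q4": 8}  # hypothetical queries
    for i, training_set in enumerate(curriculum_stages(workload, 2), start=1):
        hardest = max(workload[q] for q in training_set)
        print(i, training_set, left_deep_plans(hardest))
```

In a real training loop, each yielded set would be used for some number of DRL episodes before the next, harder slice is added; how many episodes to spend per stage is a tuning choice this sketch leaves open.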

Supplemental Material

MP4 File


    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN: 9781450393850
    DOI: 10.1145/3534678

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. database
    2. graph representation
    3. join order

    Funding Sources

    • Shenzhen Municipal Science and Technology R&D Funding Basic Research Program
    • NSFC

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
