skip to main content
10.1145/3459637.3481947acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Scalable Learning to Troubleshoot Query Performance Problems

Published:30 October 2021Publication History

ABSTRACT

Query optimization has long been fundamental for database systems. There are cracks in the edifice, however, as the complexity of modern query workloads outpace what database systems can manage well. Automatic tools are needed for database vendors, such as IBM with Db2, to help customers troubleshoot their performance problems, as manual troubleshooting is painstaking. To manage complex and large workloads, we develop a distributed system called dGALO that learns recurring problem patterns in query plans over workloads. dGALO employs these problem patterns to build a RDF-based, SPARQL-queried knowledge-base of plan-rewrite remedies. We illustrate a distributed implementation of dGALO on Apache Spark with efficient partitioning strategies for load balancing. The system employs additional pruning strategies via clustering, which yields a fine-grained trade off between runtime and accuracy. dGALO uses its knowledge-base to re-optimize queries, often to dramatic effect, and is a valuable tool for the development team to refine the optimizer with new techniques. We demonstrate by an experimental study over the TPC-DS benchmark the efficiency and effectiveness of our techniques.

Skip Supplemental Material Section

Supplemental Material

CIKM-2021-video-afp0562.mp4

mp4

19.8 MB

References

  1. S. Agrawal, S. Chaudhuri, L. Kollar, A. Marathe, V. Narasayya, and M. Syamala. 2005. Database tuning advisor for microsoft sql server 2005. In SIGMOD. 930--932.Google ScholarGoogle Scholar
  2. R. Barber, G. Lohman, I. Pandis, V. Raman, R. Sidle, G. Attaluri, N. Chainani, S. Lightstone, and D. Sharpe. 2014. Memory-efficient Hash Joins. PVLDB, Vol. 8, 4 (2014), 353--364.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. N. Bruno, S. Chaudhuri, and R. Ramamurthy. 2009a. Interactive Plan Hints for Query Optimization. In SIGMOD. 1043--1046.Google ScholarGoogle Scholar
  4. N. Bruno, S. Chaudhuri, and R. Ramamurthy. 2009b. Power Hints for Query Optimization. In ICDE. 469--480.Google ScholarGoogle Scholar
  5. IBM Knowledge Center. 2021. IBM InfoSphere Optim Query Workload Tuner. https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.idm.tools.doc/doc/c0057033.htmlGoogle ScholarGoogle Scholar
  6. S. Chaudhuri and K. Shim. 1999. Optimization of Queries with User-defined Predicates. TODS, Vol. 24, 2 (1999), 177--228.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. B. Dageville, D. Das, K. Dias, K. Yagoub, M. Zait, and M. Ziauddin. 2004. Automatic SQL Tuning in Oracle 10G. In VLDB. 1098--1109.Google ScholarGoogle Scholar
  8. G. Damasio, S. Bryson, V. Corvinelli, P. Godfrey, P. Mierzejewski, A. Mihaylov, J. Szlichta, and C. Zuzarte. 2019 a. GALO: Guided Automated Learning for re-Optimization. PVLDB, Vol. 12, 12 (2019), 1778--1781.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. G. Damasio, V. Corvinelli, P. Godfrey, P. Mierzejewski, A. Mihaylov, J. Szlichta, and C. Zuzarte. 2019 b. Guided Automated Learning for query workload re-Optimization. PVLDB, Vol. 12, 12 (2019), 2010--2021.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. G. Damasio, P. Mierzejewski, J. Szlichta, and C. Zuzarte. 2016a. OptImatch: Semantic web system for query problem determination. In ICDE. 1334--1337.Google ScholarGoogle Scholar
  11. G. Damasio, P. Mierzejewski, J. Szlichta, and C. Zuzarte. 2016b. Query Performance Problem Determination with Knowledge Base in Semantic Web System OptImatch. In EDBT. 515--526.Google ScholarGoogle Scholar
  12. J. Dean and S. Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM, Vol. 51, 1 (2008), 107--113.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. El-Helw, I. F. Ilyas, and C. Zuzarte. 2009. Statadvisor: Recommending statistical views. PVLDB, Vol. 2, 2 (2009), 1306--1317.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Ester, H. Kriegel, J. Sander, and X. Xu. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD. 226--231.Google ScholarGoogle Scholar
  15. J. Gryz, Q. Wang, X. Qian, and C. Zuzarte. 2008. SQL Queries with CASE Expressions. In ISMIS. 351--360.Google ScholarGoogle Scholar
  16. Z. He, B. S. Lee, and R. Snapp. 2005. Self-tuning Cost Modeling of User-defined Functions in an Object-relational DBMS. TODS, Vol. 30, 3 (2005), 812--853.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Viktor Leis, Bernhard Radke, Andrey Gubichev, Alfons Kemper, and Thomas Neumann. 2017. Cardinality Estimation Done Right: Index-Based Join Sampling. In CIDR.Google ScholarGoogle Scholar
  18. Henry Liu, Mingbin Xu, Ziting Yu, Vincent Corvinelli, and Calisto Zuzarte. 2015. Cardinality estimation using neural networks. In Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering. IBM Corp., 53--59.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. R. Marcus and O. Papaemmanouil. 2018. Deep Reinforcement Learning for Join Order Enumeration. In aiDM. Article 3, 4 pages.Google ScholarGoogle Scholar
  20. V. Muntés-Mulero, J. Aguilar-Saborit, C. Zuzarte, and J. Larriba-Pey. 2006. CGO: A Sound Genetic Optimizer for Cyclic Query Graphs. In ICCS. 156--163.Google ScholarGoogle Scholar
  21. J. Ortiz, M. Balazinska, J. Gehrke, and S. Keerthi. 2018. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In DEEM. 1--4.Google ScholarGoogle Scholar
  22. A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. C. Mowry, M. Perron, and I. Quah. 2017. Self-Driving Database Management Systems. In CIDR.Google ScholarGoogle Scholar
  23. P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, and T. Price. 1979. Access path selection in a relational database management system. In SIGMOD. 23--34.Google ScholarGoogle Scholar
  24. J. Szlichta, P. Godfrey, L. Golab, M. Kargar, and D. Srivastava. 2017. Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization. PVLDB, Vol. 10, 7 (2017), 721--732.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. M. Ziauddin, D. Das, H. Su, Y. Zhu, and K. Yagoub. 2008. Optimizer Plan Change Management: Improved Stability and Performance in Oracle 11G. PVLDB, Vol. 1, 2 (2008), 1346--1355.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. C. Zilio, J. Rao, S. Lightstone, G. Lohman, A. Storm, C. Garcia-Arellano, and S. Fadden. 2004 a. DB2 Design Advisor: Integrated Automatic Physical Database Design. In VLDB. 1087--1097.Google ScholarGoogle Scholar
  27. D. C. Zilio, C. Zuzarte, S. Lightstone, W. Ma, G. M. Lohman, R. J. Cochrane, H. Pirahesh, L. Colby, J. Gryz, E. Alton, and G. Valentin. 2004 b. Recommending materialized views and indexes with the IBM DB2 Design Advisor. In ICAC. 180--187.Google ScholarGoogle Scholar

Index Terms

  1. Scalable Learning to Troubleshoot Query Performance Problems

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
      October 2021
      4966 pages
      ISBN:9781450384469
      DOI:10.1145/3459637

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 30 October 2021

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,861of8,427submissions,22%

      Upcoming Conference

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader