skip to main content
10.1145/1838126.1838131acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Using the optimizer to generate an effective regression suite: a first step

Published:07 June 2010Publication History

ABSTRACT

Query optimizers play a critical role in the success of every relational database system. However, regression testing for optimizers remains an ad hoc, tedious, and time consuming process. Typically, a large number of SQL query suites are employed for regression testing. These suites are manually designed at great cost by development and QA groups or are collected from various customers or benchmarks such as TPC-H or TPC-DS. While these suites are useful in capturing regressions, optimizers continue to be plagued by regressions and bug fixing requiring expensive human intervention. This may be because these ad-hoc regression queries are redundant in the sense that they are not covering different parts of the optimizer plan space. This paper introduces a novel way in which the optimizer itself is used to generate an economical regression suite. Our approach eliminates the tedium in manually designing a regression suite and removes redundancy in the suite. As a first step towards solving this very difficult problem, we shall focus on the join plan space in this paper with a small number of tables. We show that our generated queries exhibit 50% more distinct join plans than TPC-H and TPC-DS combined. The generated queries have also been very useful for validating the optimizer's cost functions and hence can be used as a test suite as well. Since this is a new approach, we will highlight some of the areas that need a closer look by the research community.

References

  1. Bati, H., Giakoumakis, L., Herbert, S., and Surna, A. A genetic approach for random testing of database systems. VLDB 2007: 1243--1251 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. DeWitt D. J., Naughton, J. F., Schneider, D. A., and Seshadri, S. Practical Skew Handling in Parallel Joins. VLDB 1992: 27--40 Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Graefe G. The Cascades Framework for Query Optimization. IEEE Data Eng. Bull. 18(3): 19--29 (1995)Google ScholarGoogle Scholar
  4. Haftmann, F., Kossmann D., and Lo, E. A framework for efficient regression tests on database applications. VLDB J. 16(1): 145--164 (2007) Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. http://h71028.www7.hp.com/enterprise/us/en/software/bi-product-neoview.htmlGoogle ScholarGoogle Scholar
  6. Muralikrishna, M. Improved Unnesting Algorithms for Join Aggregate SQL Queries. VLDB 1992: 91--102 Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Othayoth R. and Poess, M. The Making of TPC-DS, VLDB 2006: 1049--1058 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Poess M. and Stephens, M. J. Generating Thousand Benchmark Queries in Seconds. VLDB 2004: 1045--1053 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Poosala V., Ioannidis Y. E., Haas P. J., and Shekita E. J. Improved Histograms for Selectivity Estimation of Range Predicates. SIGMOD Conference 1996: 294--305 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Pöss M., http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/n/Nambiar:Raghunath_Othayoth.html Othayoth R. N., and Walrath D. Why You Should Run TPC-DS: A Workload Analysis. VLDB 2007: 1138--1149 Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Reddy N. and Haritsa, J. R. Analyzing Plan Diagrams of Database Query Optimizers. VLDB 2005: 1228--1240 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Schneider D. A. and DeWitt, D. J. Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469--480 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Slutz, D. R. Massive Stochastic Testing of SQL. VLDB 1998: 618--622 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. http://tpc.org/tpch/spec/tpch2.7.0.pdfGoogle ScholarGoogle Scholar
  15. Waas F. and Galindo-Legaria, C. A. Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer. SIGMOD Conference 2000: 499--509 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Willmor D. and Embury, S. M. A Safe Regression Test Selection Technique for Database-Driven Apps. ICSM 2005: 421--434 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Using the optimizer to generate an effective regression suite: a first step

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            DBTest '10: Proceedings of the Third International Workshop on Testing Database Systems
            June 2010
            51 pages
            ISBN:9781450301909
            DOI:10.1145/1838126

            Copyright © 2010 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 7 June 2010

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            Overall Acceptance Rate31of56submissions,55%