ABSTRACT
Many database applications deploy hundreds or thousands of indexes to speed up query execution. Despite a plethora of prior work on index selection, designing and deploying indexes remains a difficult task for database administrators. First, real-world businesses often require online index deployment, and the traditional off-line approach to index selection ignores intermediate workload performance during index deployment. Second, recent work on on-line index selection does not address effects of complex interactions that manifest during index deployment.
In this paper, we propose a new approach that incorporates transitional design performance into the overall problem of physical database design. We call our approach Incremental Database Design. As the first step in this direction, we study the problem of ordering index deployment. The benefits of a good index deployment order are twofold: (1) a prompt query runtime improvement and (2) a reduced total time to deploy the design. Finding an effective deployment order is difficult due to complex index interaction and a factorial number of possible solutions.
We formulate a mathematical model to represent the index ordering problem and demonstrate that Constraint Programming (CP) is a more efficient solution compared to other methods such as mixed integer programming and A * search. In addition to exact search techniques, we also study local search algorithms that make significant improvements over a greedy solution with minimal computational overhead.
Our empirical analysis using the TPC-H dataset shows that our pruning techniques can reduce the size of the search space by many orders of magnitude. Using the TPC-DS dataset, we verify that our local search algorithm is a highly scalable and stable method for quickly finding the best known solutions.
- S. Agrawal, S. Chaudhuri, and V. Narasayya. Automated selection of materialized views and indexes in sql databases. In VLDB, 2000. Google ScholarDigital Library
- S. Agrawal, E. Chu, and V. Narasayya. Automatic physical design tuning: workload as a sequence. In SIGMOD, 2006. Google ScholarDigital Library
- P. Baptiste, C. L. Pape, and W. Nuijten. Constraint-Based Scheduling - Applying Constraint Programming to Scheduling Problems. Springer, 2001. Google ScholarDigital Library
- M. Blaschka, C. Sapia, and G. Hofling. On schema evolution in multidimensional databases. DataWarehousing and Knowledge Discovery, pages 153--164, 1999. Google ScholarDigital Library
- N. Bruno and S. Chaudhuri. An online approach to physical design tuning. In ICDE, 2007.Google ScholarCross Ref
- N. Bruno and S. Chaudhuri. Physical design refinement: The merge-reduce approach. TODS, 2007. Google ScholarDigital Library
- S. Chaudhuri and V. Narasayya. AutoAdmin what-if index analysis utility. In SIGMOD, 1998. Google ScholarDigital Library
- S. Chaudhuri and V. R. Narasayya. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB, 1997. Google ScholarDigital Library
- D. Dash, N. Polyzotis, and A. Ailamaki. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads. In PVLDB, volume 4, 2011. Google ScholarDigital Library
- S. Finkelstein, M. Schkolnick, and P. Tiberio. Physical database design for relational databases. TODS, 13(1):91--128, 1988. Google ScholarDigital Library
- F. Glover and G. Kochenberger. Handbook of metaheuristics. Springer, 2003.Google ScholarCross Ref
- F. Glover and M. Laguna. Tabu Search. Kluwer, 1997. Google ScholarDigital Library
- H. Gupta and I. Mumick. Selection of views to materialize under a maintenance cost constraint. ICDT, 1999. Google ScholarDigital Library
- S. Idreos, M. L. Kersten, and S. Manegold. Database cracking. In CIDR, 2007.Google Scholar
- H. Kimura, C. Coffrin, A. Rasin, and S. Zdonik. A Constraint-Programming Approach To Optimize Index Deployment Order (Extended Version). Arxiv/CoRR abs arXiv:1107.3606, 2011.Google Scholar
- H. Kimura, G. Huo, A. Rasin, S. Madden, and S. B. Zdonik. Correlation Maps: A Compressed Access Method for Exploiting Soft Functional Dependencies. In VLDB, 2009. Google ScholarDigital Library
- H. Kimura, G. Huo, A. Rasin, S. Madden, and S. B. Zdonik. CORADD: Correlation aware database designer for materialized views and indexes. VLDB, 2010. Google ScholarDigital Library
- C. Mohan and I. Narang. Algorithms for creating indexes for very large tables without quiescing updates. In Proceedings of the 1992 ACM SIGMOD international conference on Management of data, SIGMOD '92, 1992. Google ScholarDigital Library
- S. Papadomanolakis and A. Ailamaki. An Integer Linear Programming Approach to Database Design. In ICDE, 2007. Google ScholarDigital Library
- K. Schnaitter, S. Abiteboul, T. Milo, and N. Polyzotis. On-Line Index Selection for Shifting Workloads. In Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop, 2007. Google ScholarDigital Library
- K. Schnaitter et al. Index interactions in physical design tuning: Modeling, analysis, and applications. VLDB, 2009. Google ScholarDigital Library
- J. P. Sousa and L. A. Wolsey. A time indexed formulation of non-preemptive single machine scheduling problems. Mathematical Programming, 54:353--367, 1992. Google ScholarDigital Library
- P. Van Hentenryck and L. Michel. Constraint-based local search. The MIT Press, 2009. Google ScholarDigital Library
- D. C. Zilio et al. DB2 design advisor: integrated automatic physical database design. In VLDB, 2004. Google ScholarDigital Library
Recommendations
Comparison between the zeroth-order Randić index and the sum-connectivity index
The zeroth-order Randić index and the sum-connectivity index are very popular topological indices in mathematical chemistry. These two indices are based on vertex degrees of graphs and attracted a lot of attention in recent years. Recently Li and Li (...
The asymptotic value of the zeroth-order Randić index and sum-connectivity index for trees
The zeroth-order Randić index and sum-connectivity index are two indices based on the vertex degrees. They appeared in the topological formula for the total π-electron energy of conjugated molecules and attracted a lot of attention in recent years. Let ...
Hybrid index for spatio-temporal OLAP operations
ADVIS'06: Proceedings of the 4th international conference on Advances in Information SystemsAccording to increase of spatial data, many decision support systems require the fast spatio-temporal analysis. This paper proposes the improved index for efficient OLAP in a spatial data warehouse. The main idea is to use the hybrid index of the ...
Comments