Abstract
Vertical and horizontal partitions allow database administrators (DBAs) to considerably improve the performance of business intelligence applications. However, finding and defining suitable horizontal and vertical partitions is a daunting task even for experienced DBAs, because the DBA has to understand the physical query execution plan of each query in the workload very well to make appropriate design decisions. To facilitate this process, several algorithms and advisory tools have been developed over the past years. These tools, however, still keep the DBA in the loop: the physical design cannot be changed without human intervention. This is problematic in situations where a skilled DBA is not available or the workload changes over time, e.g. due to new DB applications, changed hardware, an increasing dataset size, or bursts in the query workload. In this paper, we present AutoStore: a self-tuning data store which, rather than keeping the DBA in the loop, monitors the current workload and partitions the data automatically at checkpoint time intervals, without human intervention. This allows AutoStore to gradually adapt the partitions to best fit the observed query workload. In contrast to previous work, we express partitioning as a One-Dimensional Partitioning Problem (1DPP), with the Horizontal (HPP) and Vertical Partitioning Problem (VPP) being just two variants of it. We provide an efficient \(\textsc{O}^2\textsc{P}\) (One-dimensional Online Partitioning) algorithm to solve 1DPP. \(\textsc{O}^2\textsc{P}\) is faster than the specialized affinity-based VPP algorithm by more than two orders of magnitude, and yet it does not lose much on partitioning quality. AutoStore is a part of the OctopusDB vision of a One Size Fits All Database System [13]. Our experimental results on TPC-H datasets show that AutoStore outperforms row and column layouts by up to a factor of 2.
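To make the one-dimensional view of vertical partitioning more concrete, the following is a minimal sketch and not the \(\textsc{O}^2\textsc{P}\) algorithm itself, whose details are not given in the abstract. It assumes a fixed attribute order, a toy cost model (bytes read per row plus a hypothetical constant overhead for each partition a query touches), and a simple greedy search that keeps adding the most beneficial split line. All names (`partitions_from_splits`, `workload_cost`, `greedy_splits`), the attribute widths, and the overhead value are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Set


def partitions_from_splits(attrs: Sequence[str], splits: Set[int]) -> List[List[str]]:
    """Cut the ordered attribute list after every index contained in `splits`."""
    groups: List[List[str]] = []
    current: List[str] = []
    for i, attr in enumerate(attrs):
        current.append(attr)
        if i in splits or i == len(attrs) - 1:
            groups.append(current)
            current = []
    return groups


def workload_cost(groups: List[List[str]], widths: Dict[str, int],
                  workload: List[Set[str]], overhead: int = 2) -> int:
    """Toy cost (an assumption): a query reads every partition that holds at
    least one attribute it references, paying the partition's full width plus
    a constant per-partition overhead."""
    total = 0
    for query in workload:
        for group in groups:
            if query & set(group):
                total += sum(widths[a] for a in group) + overhead
    return total


def greedy_splits(attrs: Sequence[str], widths: Dict[str, int],
                  workload: List[Set[str]]) -> Set[int]:
    """Start from the row layout (no split lines) and repeatedly add the single
    split line that lowers the workload cost most; stop when none helps."""
    splits: Set[int] = set()
    best = workload_cost(partitions_from_splits(attrs, splits), widths, workload)
    while True:
        scored = [
            (workload_cost(partitions_from_splits(attrs, splits | {i}), widths, workload), i)
            for i in range(len(attrs) - 1) if i not in splits
        ]
        if not scored:
            break
        cost, i = min(scored)
        if cost >= best:
            break
        splits.add(i)
        best = cost
    return splits


# Hypothetical table with four attributes and a three-query workload.
attrs = ["a", "b", "c", "d"]
widths = {"a": 4, "b": 4, "c": 4, "d": 8}
workload = [{"a", "b"}, {"a", "b"}, {"d"}]

chosen = greedy_splits(attrs, widths, workload)
layout = partitions_from_splits(attrs, chosen)
row_cost = workload_cost([attrs], widths, workload)
column_cost = workload_cost([[a] for a in attrs], widths, workload)
layout_cost = workload_cost(layout, widths, workload)
print(layout, layout_cost, row_cost, column_cost)
```

Under this toy cost model the greedy search groups `a` and `b`, which are always accessed together, and isolates `c` and `d`, giving a layout that is cheaper than both the pure row and the pure column layout. A horizontal variant could be sketched the same way by treating split lines over an ordered set of rows or value ranges instead of attributes.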
References
Agrawal, S., et al.: Database Tuning Advisor for Microsoft SQL Server 2005. In: VLDB (2004)
Agrawal, S., Chu, E., Narasayya, V.: Automatic Physical Design Tuning: Workload as a Sequence. In: SIGMOD (2006)
Agrawal, S., et al.: Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design. In: SIGMOD (2004)
Alagiannis, I., et al.: An Automated, Yet Interactive and Portable DB Designer. In: SIGMOD (2010)
Bruno, N., Chaudhuri, S.: Physical Design Refinement: The “Merge-Reduce” Approach. In: Ioannidis, Y., Scholl, M.H., Schmidt, J.W., Matthes, F., Hatzopoulos, M., Böhm, K., Kemper, A., Grust, T., Böhm, C. (eds.) EDBT 2006. LNCS, vol. 3896, pp. 386–404. Springer, Heidelberg (2006)
Bruno, N., et al.: An Online Approach to Physical Design Tuning. In: ICDE (2007)
Bruno, N., Chaudhuri, S.: Constrained Physical Design Tuning. In: PVLDB (2008)
Chaudhuri, S., Narasayya, V.: An Efficient Cost-driven Index Selection Tool for Microsoft SQL Server. In: VLDB (1997)
Chu, W.W., Ieong, I.T.: A Transaction-Based Approach to Vertical Partitioning for Relational Database Systems. IEEE TSE 19(8), 804–812 (1993)
Cornell, D.W., Yu, P.S.: A Vertical Partitioning Algorithm for Relational Databases. In: ICDE (1987)
Cornell, D.W., Yu, P.S.: An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases. IEEE TSE 16(2), 248–258 (1990)
Curino, C., et al.: Schism: a Workload-Driven Approach to Database Replication and Partitioning. In: PVLDB (2010)
Dittrich, J., Jindal, A.: Towards a One Size Fits All Database Architecture. In: CIDR (2011)
Dittrich, J.-P., Fischer, P.M., Kossmann, D.: AGILE: Adaptive Indexing for Context-aware Information Filters. In: SIGMOD (2005)
Grund, M., et al.: HYRISE - A Main Memory Hybrid Storage Engine. In: PVLDB (2010)
Hammer, M., et al.: A Heuristic Approach to Attribute Partitioning. ACM TODS (1979)
Hankins, R.A., Patel, J.M.: Data Morphing: An Adaptive, Cache-Conscious Storage Technique. In: VLDB (2003)
Hoffer, J.A., Severance, D.G.: The Use of Cluster Analysis in Physical Data Base Design. In: VLDB (1975)
Idreos, S., et al.: Database Cracking. In: CIDR (2007)
Jermaine, C., Omiecinski, E., Yee, W.G.: The partitioned exponential file for database storage management. The VLDB Journal 16, 417–437 (2007)
Jindal, A.: The Mimicking Octopus: Towards a one-size-fits-all Database Architecture. In: VLDB PhD Workshop (2010)
Kimura, H., et al.: CORADD: Correlation Aware Database Designer for Materialized Views and Indexes. In: VLDB (2010)
Navathe, S., et al.: Vertical Partitioning Algorithms for Database Design. ACM TODS (1984)
Navathe, S., Ra, M.: Vertical Partitioning for Database Design: A Graphical Algorithm. In: SIGMOD (1989)
O’Neil, P.E., et al.: The Log-Structured Merge-Tree (LSM-Tree). Acta Inf. (1996)
Ozmen, O., Salem, K., Schindler, J., Daniel, S.: Workload-Aware Storage Layout for Database Systems. In: SIGMOD (2010)
Papadomanolakis, S., Ailamaki, A.: AutoPart: Automating Schema Design for Large Scientific Databases Using Data Partitioning. In: SSDBM (2004)
Raman, V., et al.: Constant-Time Query Processing. In: ICDE (2008)
Sacca, D., Wiederhold, G.: Database Partitioning in a Cluster of Processors. ACM TODS 10(1), 29–56 (1985)
Schnaitter, K., et al.: COLT: Continuous On-Line Database Tuning. In: SIGMOD (2006)
Zhou, J., et al.: Dynamic Materialized Views. In: ICDE (2007)
Zilio, D.C., et al.: DB2 Design Advisor: Integrated Automatic Physical Database Design. In: VLDB (2004)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jindal, A., Dittrich, J. (2012). Relax and Let the Database Do the Partitioning Online. In: Castellanos, M., Dayal, U., Lehner, W. (eds) Enabling Real-Time Business Intelligence. BIRTE 2011. Lecture Notes in Business Information Processing, vol 126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33500-6_5
DOI: https://doi.org/10.1007/978-3-642-33500-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33499-3
Online ISBN: 978-3-642-33500-6
eBook Packages: Computer Science, Computer Science (R0)