Abstract
Vertical partitioning is a well-explored area of automatic physical database design. The classic approach is as follows: derive an optimal vertical partitioning scheme for a given database and a workload. The workload describes queries, their frequencies, and involved attributes.
In this paper we identify a novel class of vertical partitioning algorithms. The algorithms of this class do not rely on knowledge of the workload, but instead use properties of the stored data itself. We propose one such algorithm that uses a logical scheme represented by functional dependencies, which are derived from stored data. To discover functional dependencies we use TANE, a popular functional dependency extraction algorithm. We evaluate our algorithm using an industrial DBMS (PostgreSQL) on a number of workloads. We compare the performance of an unpartitioned configuration with partitions produced by our algorithm and by several state-of-the-art workload-aware algorithms.
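To make the notion of a functional dependency derived from stored data concrete, the following minimal sketch checks whether a dependency X -> A holds in a small in-memory table and lists all single-attribute dependencies by brute force. It is an illustration only: it is neither TANE nor the partitioning algorithm proposed in the paper, and the names `fd_holds`, `unary_fds`, and the sample table `ROWS` are hypothetical.

```python
from itertools import permutations


def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts, lhs is a tuple of attribute names,
    rhs is a single attribute name.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # The first value seen for a key is stored; any later mismatch
        # for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


def unary_fds(rows):
    """Brute-force search for dependencies of the form a -> b (illustrative, not TANE)."""
    attrs = list(rows[0]) if rows else []
    return [(a, b) for a, b in permutations(attrs, 2) if fd_holds(rows, (a,), b)]


# Hypothetical sample data, not taken from the paper.
ROWS = [
    {"emp_id": 1, "dept": "db", "building": "A"},
    {"emp_id": 2, "dept": "db", "building": "A"},
    {"emp_id": 3, "dept": "os", "building": "B"},
    {"emp_id": 4, "dept": "ml", "building": "B"},
]

if __name__ == "__main__":
    for lhs, rhs in unary_fds(ROWS):
        print(f"{lhs} -> {rhs}")
```

In this sample, `dept -> building` holds while `building -> dept` does not, because building B hosts two departments. Dependencies of this kind are computed from the data alone, without any knowledge of the queries.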
Notes
- 1. TANE implementation. http://www.cs.helsinki.fi/research/fdk/datamining/tane/.
Acknowledgments
We would like to thank the anonymous reviewers for their valuable comments on this work. This work is partially supported by the Russian Foundation for Basic Research, grant 16-57-48001.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bobrov, N., Chernishev, G., Novikov, B. (2017). Workload-Independent Data-Driven Vertical Partitioning. In: Kirikova, M., et al. New Trends in Databases and Information Systems. ADBIS 2017. Communications in Computer and Information Science, vol 767. Springer, Cham. https://doi.org/10.1007/978-3-319-67162-8_27
DOI: https://doi.org/10.1007/978-3-319-67162-8_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67161-1
Online ISBN: 978-3-319-67162-8
eBook Packages: Computer Science, Computer Science (R0)