Abstract
In cloud computing environments, data volume is growing exponentially, which raises new challenges for efficient distributed data storage and management in large-scale OLTP and OLAP applications. Horizontal and vertical database partitioning can improve the performance and manageability of shared-nothing systems, which are popular nowadays. However, existing partitioning techniques cannot handle dynamic information efficiently and cannot produce partitioning strategies in real time. In this paper, we present WSPA, a workload-driven stream vertical partitioning approach based on a streaming framework. We construct an affinity matrix to capture the mapping information from a workload, cluster attributes according to their affinity, and then obtain the optimal partitioning scheme using a cost model. The experimental results show that WSPA achieves good partitioning quality and lower time complexity than existing vertical partitioning methods. It is an efficient partitioning method for processing dynamic, large-scale queries.
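To make the affinity-matrix step concrete, the following is a minimal sketch of the classical attribute-affinity construction, in which the affinity of two attributes is the total frequency of the queries that access both. It is not the WSPA implementation: the abstract does not specify the clustering rule or the cost model, so the threshold-based grouping, the function names (`affinity_matrix`, `cluster_by_affinity`), and the sample attributes and workload are all assumptions made for illustration.

```python
from itertools import combinations


def affinity_matrix(attributes, workload):
    """Build a symmetric attribute affinity matrix.

    workload is a list of (accessed_attributes, frequency) pairs.
    aff[i][j] is the summed frequency of queries touching both attributes;
    aff[i][i] is the summed frequency of queries touching attribute i.
    """
    index = {name: i for i, name in enumerate(attributes)}
    n = len(attributes)
    aff = [[0] * n for _ in range(n)]
    for accessed, freq in workload:
        used = sorted(index[name] for name in accessed)
        for i in used:
            aff[i][i] += freq
        for i, j in combinations(used, 2):
            aff[i][j] += freq
            aff[j][i] += freq
    return aff


def cluster_by_affinity(attributes, aff, threshold):
    """Group attributes linked by affinity >= threshold.

    Assumed stand-in for the paper's clustering step: connected
    components over the thresholded affinity graph, via union-find.
    """
    n = len(attributes)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if aff[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(attributes):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


# Hypothetical table and workload, for illustration only.
ATTRIBUTES = ["id", "name", "email", "balance"]
WORKLOAD = [
    ({"id", "name"}, 10),
    ({"id", "email"}, 4),
    ({"balance"}, 7),
    ({"name", "email"}, 1),
]

aff = affinity_matrix(ATTRIBUTES, WORKLOAD)
fragments = cluster_by_affinity(ATTRIBUTES, aff, threshold=4)
print(fragments)
```

In a streaming setting the matrix can be updated incrementally, adding each arriving query's frequency to the affected cells, so a candidate partitioning can be recomputed without rescanning the whole workload. A cost model would then choose among candidate groupings, for example those produced by different thresholds.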
Acknowledgments
This work is partially supported by the Tianjin Application Foundation and Advanced Technology Research Project No. 14JCYBJC15500 and the Special Fund for the Doctoral Program of Higher Education No. 20130031120029.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kang, H., Guo, M., Yuan, X. (2016). A Workload-Driven Vertical Partitioning Approach Based on Streaming Framework. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_6
DOI: https://doi.org/10.1007/978-3-319-45817-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45816-8
Online ISBN: 978-3-319-45817-5
eBook Packages: Computer Science, Computer Science (R0)