A Workload-Driven Vertical Partitioning Approach Based on Streaming Framework

  • Conference paper
Web Technologies and Applications (APWeb 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9932)

Abstract

In cloud computing environments, data volumes are growing exponentially, which raises new challenges for efficient distributed data storage and management in large-scale OLTP and OLAP applications. Horizontal and vertical database partitioning can improve the performance and manageability of shared-nothing systems, which are popular nowadays. However, existing partitioning techniques cannot handle dynamic information efficiently and cannot produce partitioning strategies in real time. In this paper, we present WSPA, a workload-driven stream vertical partitioning approach based on a streaming framework. We construct an affinity matrix to capture the mapping information from a workload, cluster attributes according to their affinity, and then obtain the optimal partitioning scheme using a cost model. The experimental results show that WSPA achieves good partitioning quality and lower time complexity than existing vertical partitioning methods. It is an efficient partitioning method for processing dynamic, large-scale queries.
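
The pipeline outlined in the abstract (affinity matrix, attribute clustering, cost-based selection) can be illustrated with a minimal Python sketch. This is not the WSPA algorithm itself, whose details are not given here. It assumes the classic definition of attribute affinity (the summed frequency of queries that access both attributes), swaps in a simple threshold-based connected-components clustering, and uses a toy scan cost that ignores tuple reconstruction. The function names, the sample workload, and the attribute widths are all hypothetical.

```python
from itertools import combinations


def affinity_matrix(attributes, workload):
    """Build a symmetric attribute affinity matrix.

    workload is a list of (set_of_attributes, frequency) pairs.
    aff[i][j] is the total frequency of queries touching both i and j.
    """
    index = {a: i for i, a in enumerate(attributes)}
    n = len(attributes)
    aff = [[0] * n for _ in range(n)]
    for attrs, freq in workload:
        for a in attrs:
            aff[index[a]][index[a]] += freq
        for a, b in combinations(sorted(attrs), 2):
            aff[index[a]][index[b]] += freq
            aff[index[b]][index[a]] += freq
    return aff


def cluster_by_threshold(attributes, aff, threshold):
    """Assumed clustering: connected components over pairs with affinity >= threshold."""
    n = len(attributes)
    seen = set()
    partitions = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(attributes[i])
            for j in range(n):
                if j not in seen and aff[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        partitions.append(sorted(group))
    return partitions


def scan_cost(partitions, workload, widths):
    """Toy cost model: a query reads every partition it touches in full."""
    cost = 0
    for attrs, freq in workload:
        for part in partitions:
            if attrs & set(part):
                cost += freq * sum(widths[a] for a in part)
    return cost


# Hypothetical table, workload, and attribute widths in bytes.
attributes = ["id", "name", "email", "balance"]
workload = [({"id", "name"}, 10), ({"id", "balance"}, 3), ({"name", "email"}, 7)]
widths = {"id": 4, "name": 32, "email": 64, "balance": 8}

aff = affinity_matrix(attributes, workload)
candidates = [cluster_by_threshold(attributes, aff, t) for t in (3, 7, 10)]
best = min(candidates, key=lambda p: scan_cost(p, workload, widths))
```

Because the toy cost charges nothing for joining partitions back together, it tends to favor finer partitions; a realistic cost model would also account for tuple reconstruction.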

Acknowledgments

This work is partially supported by Tianjin Application Foundation and Advanced Technology Research Project No. 14JCYBJC15500 and Special Fund for the Doctoral Program of Higher Education No. 20130031120029.

Author information

Corresponding author

Correspondence to Xiaojie Yuan.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kang, H., Guo, M., Yuan, X. (2016). A Workload-Driven Vertical Partitioning Approach Based on Streaming Framework. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9932. Springer, Cham. https://doi.org/10.1007/978-3-319-45817-5_6

  • DOI: https://doi.org/10.1007/978-3-319-45817-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45816-8

  • Online ISBN: 978-3-319-45817-5

  • eBook Packages: Computer Science, Computer Science (R0)
