Skip to main content
Log in

SWORD: workload-aware data placement and replica selection for cloud data management systems

  • Special Issue Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

Cloud computing is increasingly being seen as a way to reduce infrastructure costs and add elasticity, and is being used by a wide range of organizations. Cloud data management systems today need to serve a range of different workloads, from analytical read-heavy workloads to transactional (OLTP) workloads. For both the service providers and the users, it is critical to minimize the consumption of resources like CPU, memory, communication bandwidth, and energy, without compromising on service-level agreements if any. In this article, we develop a workload-aware data placement and replication approach, called SWORD, for minimizing resource consumption in such an environment. Specifically, we monitor and model the expected workload as a hypergraph and develop partitioning techniques that minimize the average query span, i.e., the average number of machines involved in the execution of a query or a transaction. We empirically justify the use of query span as the metric to optimize, for both analytical and transactional workloads, and develop a series of replication and data placement algorithms by drawing connections to several well-studied graph theoretic concepts. We introduce a suite of novel techniques to achieve high scalability by reducing the overhead of partitioning and query routing. To deal with workload changes, we propose an incremental repartitioning technique that modifies data placement in small steps without resorting to complete repartitioning. We propose the use of fine-grained quorums defined at the level of groups of data items to control the cost of distributed updates, improve throughput, and adapt to different workloads. We empirically illustrate the benefits of our approach through a comprehensive experimental evaluation for two classes of workloads. For analytical read-only workloads, we show that our techniques result in significant reduction in total resource consumption. For OLTP workloads, we show that our approach improves transaction latencies and overall throughput by minimizing the number of distributed transactions.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17

Similar content being viewed by others

Notes

  1. The Cartesian product of the full set of attributes forming the unique identifier.

  2. A majority of hyperedges in the cut of our compressed hypergraph representing TPC-C, a typical OLTP workload, span two partitions.

  3. It does not include the query routing time.

References

  1. MLPart: a hypergraph partitioning package. URL http://vlsicad.ucsd.edu/GSRC/bookshelf/Slots/Partitioning/MLPart/

  2. hMETIS: a hypergraph partitioning package. URL http://glaros.dtc.umn.edu/gkhome/metis/hmetis/overview

  3. Alon, N., Seymour, P.D., Thomas, R.: A separator theorem for graphs with an excluded minor and its applications. In: STOC (1990)

  4. Alpert, C.J.: The ISPD98 circuit benchmark suite. In: Proc. of Intl, Symposium on Physical Design (1998)

  5. Asahiro, Y., Iwama, K., Tamaki, H., Tokuyama, T.: Greedily finding a dense subgraph. In: SWAT (1996)

  6. Ayka, C., Cambazoglu, B., Bora, U.: Multi-level direct k-way hypergraph partitioning with multiple constraints and fixed vertices. J. Parallel Distrib. Comput. 68, 609–625 (2008)

  7. Boppana, R.B.: Eigenvalues and graph bisection: an average-case analysis. In: FOCS (1987)

  8. Bruno, N., Chaudhuri, S., Konig, A.C., Narasayya, V.R., Ramamurthy, R., Syamala, M.: Autoadmin project at Microsoft research: lessons learned. IEEE Data Eng. Bull. 34, 12–19 (2011)

  9. Caldwell, A.E., Kahng, A.B., Markov, I.L.: Design and implementation of move-based heuristics for VLSI hypergraph partitioning. J. Exp. Algorithmics. 5, 5 (2000)

    Article  Google Scholar 

  10. Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: OSDI (2006)

  11. Chowdhury, M., Zaharia, M., Ma, J., Jordan, M.I., Stoica., I.: Managing data transfers in computer clusters with orchestra. In: SIGCOMM (2011)

  12. Curino, C., Zhang, Y., Jones, E.P.C., Madden, S.: Schism: a workload-driven approach to database replication and partitioning. PVLDB 3(1), 48–57 (2010)

    Google Scholar 

  13. J. Dittrich, Ruiz, J.A.Q., Jindal, A., Kargin, Y., Setty, V., Schad, J.: Hadoop++: making a yellow elephant run like a cheetah (without it even noticing). In: PVLDB, vol. 3, pp. 515–529 (2010)

  14. Economou, D., Rivoire, S., Kozyrakis, C.: Full-system power analysis and modeling for server environments. In: MOBS (2006)

  15. Eltabakh, M.Y., Tian, Y., Ozcan, F., Gemulla, R., Krettek, A., McPherson, J.: Cohadoop: flexible data placement and its exploitation in hadoop. PVLDB 4(9), 575–585 (2011)

    Google Scholar 

  16. Feige, U.: A threshold of ln n for approximating set cover. J. ACM 45(4), 634–652 (1998)

    Article  MATH  MathSciNet  Google Scholar 

  17. Feige, U., Kortsarz, G., Peleg, D.: The dense k-subgraph problem. Algorithmica 29, 410–421 (1999)

  18. Ferhatosmanoglu, H., Tosun, A., Ramachandran, A.: Replicated declustering of spatial data. In: PODS (2004)

  19. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness (1979)

  20. Gray, J., Helland, P., O’Neil, P.E., Shasha, D.: The dangers of replication and a solution. In: SIGMOD (1996)

  21. Ho, L., Wu, J., Liu, P.: Optimal algorithms for cross-rack communication optimization in mapreduce framework. In: CLOUD (2011)

  22. Jones, E.P.C., Abadi, D.J., Madden, S.: Low overhead concurrency control for partitioned main memory databases. In: SIGMOD (2010)

  23. Karypis, G., Kumar, V.: Multilevel k-way hypergraph partitioning. In: Proc. of DAC, pp. 343–348 (1998)

  24. Karypis, G., Aggarwal, R., Kumar, V., Shekhar, S.: Multilevel hypergraph partitioning: application in VLSI domain. In: IEEE VLSI, pp. 69–529. (1999)

  25. Koyutürk, M., Aykanat, C.: Iterative-improvement-based declustering heuristics for multi-disk databases. Inf. Syst. 30, 47–70 (2005)

  26. Kumar, K.A., Quamar, A., Deshpande, A., Khuller, S.: Workload-Aware Data Placement and Replica Selection for Cloud Data Management Systems. University of Maryland Technical Report (2013)

  27. Lakshman, A., Malik, P.: Cassandra: structured storage system on a P2P network. In: PODC (2009)

  28. Liu, D., Shekhar, S.: Partitioning similarity graphs: a framework for declustering problems. Inf. Syst. 21,475–496 (1996)

  29. Meyerhenke, H., Monien, B., Sauerwald, T.: A new diffusion-based multilevel algorithm for computing graph partitions. J. Parallel Distrib. Comput. 69(9), 750–761 (2009)

    Article  Google Scholar 

  30. Nehme, R., Bruno, N.: Automated partitioning design in parallel database systems. In: SIGMOD (2011)

  31. Neves, T., Drummond, L.A., Ochi, L., Albuquerque, C., Uchoa, E.: Solving replica placement and request distribution in content distribution networks. Electron. Notes Discret. Math. 36, 89–96 (2010)

    Article  Google Scholar 

  32. Oktay, K.Y., Turk, A., Aykanat, C.: Selective replicated declustering for arbitrary queries. In: Euro-Par (2009)

  33. Pavlo, A., Curino, C., Zdonik, S.: Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems. In: SIGMOD (2012)

  34. Pavlo, A., Paulson, E., Rasin, A., Abadi, D.J., DeWitt, D.J., Madden, S., Stonebraker, M.: A comparison of approaches to large-scale data analysis. In: SIGMOD (2009)

  35. Peris, J.R., Martinez, M.P., Alonso, G., Kemme, B.: Are quorums an alternative for data replication? TODS 28(3), 257–294 (2003)

    Article  Google Scholar 

  36. Peris, R.J., Martinez, M.P.: How to select a replication protocol according to scalability, availability and communication overhead. In: SRDS (2001)

  37. Peris, R.J., Martnez, M.P., Kemme, B., Alonso, G.: How to select a replication protocol according to scalability, availability, and communication overhead. In: SRDS (2001)

  38. Quamar, A., Kumar, K.A., Deshpande, A.: Sword: scalable workload-aware data placement for transactional workloads. In: EDBT (2013)

  39. Simon, H.D., Teng, S.: How good is recursive bisection? SIAM J. Sci. Comput. 18(5), 1436–1445 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  40. Tatarowicz, A.L., Curino, C., Jones, E.P.C., Madden, S.: Lookup tables: fine-grained partitioning for distributed databases. In: ICDE (2011)

  41. Thain, D., Livny, M.: Building reliable clients and servers. In: Foster, I., Kesselman, C. (eds.) The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, Burlington (2003)

    Google Scholar 

  42. Tosun, A.A., Ferhatosmanoglu, H.: Optimal parallel I/O using replication. In: ICPP (1997)

  43. Tosun, A.S.: Replicated declustering for arbitrary queries. In: ACM Symposium on Applied Computing (2004)

  44. Wang, X., Smalter, A., Huan, J., Lushington, G.H.: G-hash: towards fast kernel-based similarity search in large graph databases. In: EDBT (2009)

  45. White, T.: Hadoop: The Definitive Guide. O’Reilly Media, 1st edn. ISBN:0596521979 (2009)

  46. Wolfson, O., Jajodia, S., Huang, Y.: An adaptive data replication algorithm. TODS 22, 255–314 (1997)

    Article  Google Scholar 

Download references

Acknowledgments

This work was supported in part by NSF Grant CCF-0937865, an IBM Collaborative Research Award, and an Amazon AWS in Education Research grant.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to K. Ashwin Kumar.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kumar, K.A., Quamar, A., Deshpande, A. et al. SWORD: workload-aware data placement and replica selection for cloud data management systems. The VLDB Journal 23, 845–870 (2014). https://doi.org/10.1007/s00778-014-0362-1

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-014-0362-1

Keywords

Navigation