
Toward cost-effective storage provisioning for DBMSs

  • Special Issue Paper
The VLDB Journal

Abstract

Data center operators face a bewildering set of choices when considering how to provision resources on machines with complex I/O subsystems. Modern I/O subsystems often have a rich mix of fast, high-performing, but expensive SSDs sitting alongside cheaper but relatively slower (for random accesses) traditional hard disk drives. Data center operators need to determine how to provision the I/O resources for specific workloads so as to abide by existing service level agreements while minimizing the total operating cost (TOC) of running the workload, where the TOC includes the amortized hardware costs and the run-time energy costs. This paper introduces the new problem of TOC-based storage allocation, cast in a framework that is compatible with traditional DBMS query optimization and query processing architecture. We also present a heuristic-based solution to this problem, called DOT. We have implemented DOT in PostgreSQL, and experiments using TPC-H and TPC-C demonstrate significant TOC reduction by DOT in various settings.


Notes

  1. A complementary problem is to pick the “right” server hardware from a range of options, for a pre-defined workload. Our framework can be adapted to this problem, and a full exploration of this variant is discussed in Sect. 5.

  2. This problem was first studied in our PVLDB paper [32] and is expanded in this extended journal paper.

  3. The queries in this subset include: Q1, Q3, Q4, Q6, Q12, Q13, Q14, Q17, Q18, Q19, Q22.

  4. When the lineitem table is placed on the SSD RAID 0 device, or an even slower storage class (e.g., SSD), the query optimizer chooses the query plan in Table 7 to execute Query 17, which means that the lineitem table has to be sequentially accessed.

  5. In Table 8, we list some costs for each component in the data center. Summing the costs of Networking Equipment, Power Distribution and Cooling, Other Infrastructure and Management gives a total of 294,943 + 626,211 + 137,461 + 105,927 = $1,164,542 per month for 46,000 servers. With a 36-month hardware lifespan, the cost per server is \(1{,}164{,}542 \div 46{,}000 \times 36 \approx \$911\) over 36 months.
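
The arithmetic in Note 5 can be checked with a short script. This is only a sketch: the four monthly figures, the server count, and the lifespan come from the note itself, while the function name `per_server_cost` and the variable names are assumptions introduced here for illustration.

```python
# Monthly cost figures quoted in Note 5 (USD, for the whole data center).
MONTHLY_COSTS = [294_943, 626_211, 137_461, 105_927]
NUM_SERVERS = 46_000       # servers sharing those costs
LIFESPAN_MONTHS = 36       # assumed hardware lifespan from the note


def per_server_cost(monthly_costs, num_servers, lifespan_months):
    """Return (total monthly cost, per-server cost over the lifespan)."""
    total_monthly = sum(monthly_costs)
    return total_monthly, total_monthly / num_servers * lifespan_months


total_monthly, per_server = per_server_cost(
    MONTHLY_COSTS, NUM_SERVERS, LIFESPAN_MONTHS
)
print(f"Total per month: ${total_monthly:,}")
print(f"Per server over {LIFESPAN_MONTHS} months: ${per_server:,.0f}")
```

The per-server value is slightly above $911 before rounding, so the figure in the note is an approximation.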

References

  1. Database test suite. http://osdldbt.sourceforge.net/

  2. Oracle SPARC SuperCluster with T3-4 servers, TPC-C 5.11.0, retrieved on 19-May-2011. http://www.tpc.org/results/individual_results/Oracle/Oracle_SPARC_SuperCluster_with_T3-4s_TPC-C_ES_120210.pdf

  3. Overall data center costs. http://perspectives.mvdirona.com/2010/09/18/OverallDataCenterCosts.aspx

  4. SQL Azure service level agreement (SLA), retrieved on October 27, 2010. http://go.microsoft.com/fwlink/?LinkId=159706

  5. Towards cost-effective storage provisioning for DBMSs: addendum, query templates and examples. http://pages.cs.wisc.edu/~nzhang/pubs/actual_queries.pdf

  6. TPC-H homepage. http://www.tpc.org/tpch/

  7. Agrawal, D., Ganesan, D., Sitaraman, R.K., Diao, Y., Singh, S.: Lazy-adaptive tree: An optimized index structure for flash devices. PVLDB 2(1), 361–372 (2009)


  8. Agrawal, S., Chu, E., Narasayya, V.R.: Automatic physical design tuning: workload as a sequence. In: SIGMOD Conference, pp. 683–694 (2006)

  9. Bobroff, N., Kochut, A., Beaty, K.A.: Dynamic placement of virtual machines for managing SLA violations. In: Integrated Network Management, pp. 119–128 (2007)

  10. Bruno, N., Chaudhuri, S.: Automatic physical database tuning: a relaxation-based approach. In: SIGMOD Conference, pp. 227–238 (2005)

  11. Bruno, N., Chaudhuri, S.: An online approach to physical design tuning. In: ICDE, pp. 826–835 (2007)

  12. Canim, M., Bhattacharjee, B., Mihaila, G.A., Lang, C.A., Ross, K.A.: An object placement advisor for DB2 using solid state storage. PVLDB 2(2), 1318–1329 (2009)


  13. Canim, M., Mihaila, G.A., Bhattacharjee, B., Ross, K.A., Lang, C.A.: SSD bufferpool extensions for database systems. PVLDB 3(2) (2010)

  14. Chaisiri, S., Lee, B.-S., Niyato, D.: Optimal virtual machine placement across multiple cloud providers. In: APSCC, pp. 103–110 (2009)

  15. Chaudhuri, S., Narasayya, V.R.: Self-tuning database systems: a decade of progress. In: VLDB, pp. 3–14 (2007)

  16. Chi, Y., Moon, H.J., Hacigumus, H.: iCBS: incremental cost-based scheduling under piecewise linear SLAs. In: PVLDB (2011)

  17. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: SOSP, pp. 205–220 (2007)

  18. Graefe, G.: The five-minute rule twenty years later, and how flash memory changes the rules. In: DaMoN, p. 6 (2007)

  19. Hamilton, J.R.: Cooperative expendable micro-slice servers (CEMS): low cost, low power servers for Internet-scale services. In: CIDR (2009)

  20. Hyser, C., McKee, B., Gardner, R., Watson, B.J.: Autonomic virtual machine placement in the data center. HPL-2007-189 (2008)

  21. Koltsidas, I., Viglas, S.: Flashing up the storage layer. PVLDB 1(1), 514–525 (2008)

  22. Lee, S.-W., Moon, B.: Design of flash-based DBMS: an in-page logging approach. In: SIGMOD Conference, pp. 55–66 (2007)

  23. Lee, S.-W., Moon, B., Park, C., Kim, J.-M., Kim, S.-W.: A case for flash memory SSD in enterprise database applications. In: SIGMOD Conference, pp. 1075–1086 (2008)

  24. Li, Y., He, B., Yang, J., Luo, Q., Yi, K.: Tree indexing on solid state drives. PVLDB 3(1), 1195–1206 (2010)


  25. Ozmen, O., Salem, K., Schindler, J., Daniel, S.: Workload-aware storage layout for database systems. In: SIGMOD Conference, pp. 939–950 (2010)

  26. Polte, M., Simsa, J., Gibson, G.: Enabling enterprise solid state disks performance. In: Workshop on Integrating Solid-state Memory into the Storage Hierarchy (2009)

  27. Ross, K.A.: Modeling the performance of algorithms on flash memory devices. In: DaMoN, pp. 11–16 (2008)

  28. Shah, M.A., Harizopoulos, S., Wiener, J.L., Graefe, G.: Fast scans and joins using flash drives. In: DaMoN, pp. 17–24 (2008)

  29. Soror, A.A., Minhas, U.F., Aboulnaga, A., Salem, K., Kokosielis, P., Kamath, S.: Automatic virtual machine configuration for database workloads. In: SIGMOD Conference, pp. 953–966 (2008)

  30. Tsirogiannis, D., Harizopoulos, S., Shah, M.A., Wiener, J.L., Graefe, G.: Query processing techniques for solid state drives. In: SIGMOD Conference, pp. 59–72 (2009)

  31. Xiong, P., Chi, Y., Zhu, S., Moon, H.J., Pu, C., Hacigumus, H.: Intelligent management of virtualized resources for database management systems in cloud environment. In: ICDE (2011)

  32. Zhang, N., Tatemura, J., Patel, J.M., Hacigümüs, H.: Towards cost-effective storage provisioning for DBMSs. PVLDB 5(4), 274–285 (2011)


Acknowledgments

We thank the reviewers for their insightful feedback on an earlier draft of this paper. This work was supported in part by a gift donation from NEC and by the National Science Foundation under Grant IIS-0963993.


Corresponding author

Correspondence to Ning Zhang.


About this article

Cite this article

Zhang, N., Tatemura, J., Patel, J.M. et al. Toward cost-effective storage provisioning for DBMSs. The VLDB Journal 23, 329–354 (2014). https://doi.org/10.1007/s00778-013-0334-x


  • DOI: https://doi.org/10.1007/s00778-013-0334-x
