And All of a Sudden: Main Memory Is Less Expensive Than Disk

  • Conference paper
Big Data Benchmarking (WBDB 2014)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8991))

Abstract

Even today, the conventional wisdom is that storing data in main memory is more expensive than storing it on disk. While this is true for the price per byte, the picture looks different for the price per bandwidth. For data-driven applications with high throughput demands, I/O bandwidth can easily become the major bottleneck. Comparing the costs of different storage types for a given bandwidth requirement shows that the old wisdom of inexpensive disks and expensive main memory is no longer valid in every case. The higher the bandwidth requirements become, the more cost-efficient main memory is. And all of a sudden: main memory is less expensive than disk.

In this paper, we show that database workloads for the next generation of enterprise systems have vastly increased bandwidth requirements. These new requirements favor in-memory systems, as they are less expensive when operational costs are taken into account. We discuss mixed enterprise workloads in comparison to traditional transactional workloads and show with a cost evaluation that main memory systems can incur a lower total cost of ownership than their disk-based counterparts.

Notes

  1. Intel Xeon E7-4890v2 Benchmark – URL: http://www.intel.com/content/www/us/en/benchmarks/server/xeon-e7-v2/xeon-e7-v2-4s-stream.html.

  2. iotop – URL: http://guichaz.free.fr/iotop/.

  3. Uptime Institute 2012 Data Center Survey – URL: http://uptimeinstitute.com/2012-survey-results.

References

  1. Boissier, M., Krueger, J., Wust, J., Plattner, H.: An integrated data management for enterprise systems. In: ICEIS 2014 - Proceedings of the 16th International Conference on Enterprise Information Systems, vol. 3, 27–30 April, pp. 410–418, Lisbon, Portugal (2014)

  2. Cole, R., Funke, F., Giakoumakis, L., Guy, W., Kemper, A., Krompass, S., Kuno, H.A., Nambiar, R.O., Neumann, T., Poess, M., Sattler, K.-U., Seibold, M., Simon, E., Waas, F.: The mixed workload ch-benchmark. In: DBTest, p. 8. ACM (2011)

  3. Difallah, D.E., Pavlo, A., Curino, C., Cudré-Mauroux, P.: OLTP-bench: an extensible testbed for benchmarking relational databases. PVLDB 7(4), 277–288 (2013)

  4. Färber, F., May, N., Lehner, W., Große, P., Müller, I., Rauhe, H., Dees, J.: The SAP HANA database - an architecture overview. IEEE Data Eng. Bull. 35(1), 28–33 (2012)

  5. Grund, M., Krueger, J., Plattner, H., Zeier, A., Cudré-Mauroux, P., Madden, S.: HYRISE - a main memory hybrid storage engine. PVLDB 4(2), 105–116 (2010)

  6. H-Store Documentation: MapReduce Transactions. http://hstore.cs.brown.edu/documentation/deployment/mapreduce/

  7. Harizopoulos, S., Abadi, D.J., Madden, S., Stonebraker, M.: OLTP through the looking glass, and what we found there. In: SIGMOD Conference, pp. 981–992. ACM (2008)

  8. Idreos, S., Groffen, F., Nes, N., Manegold, S., Mullender, K.S., Kersten, M.L.: MonetDB: two decades of research in column-oriented database architectures. IEEE Data Eng. Bull. 35(1), 40–45 (2012)

  9. Kemper, A., Neumann, T., Finis, J., Funke, F., Leis, V., Muehe, H., Muehlbauer, T., Roediger, W.: Processing in the hybrid OLTP & OLAP main-memory database system hyper. IEEE Data Eng. Bull. 36(2), 41–47 (2013)

  10. Larson, P., Clinciu, C., Fraser, C., Hanson, E.N., Mokhtar, M., Nowakiewicz, M., Papadimos, V., Price, S.L., Rangarajan, S., Rusanu, R., Saubhasik, M.: Enhancements to SQL server column stores. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, June 22–27, pp. 1159–1168, New York (2013)

  11. Malladi, K.T., Lee, B.C., Nothaft, F.A., Kozyrakis, C., Periyathambi, K., Horowitz, M.: Towards energy-proportional datacenter memory with mobile dram. In: SIGARCH Computer Architecture News, vol. 40(3), pp. 37–48 (2012)

  12. Plattner, H.: The impact of columnar in-memory databases on enterprise systems. PVLDB 7(13), 1722–1729 (2014)

  13. Raman, V., Attaluri, G.K., Barber, R., Chainani, N., Kalmuk, D., KulandaiSamy, V., Leenstra, J., Lightstone, S., Liu, S., Lohman, G.M., Malkemus, T., Müller, R., Pandis, I., Schiefer, B., Sharpe, D., Sidle, R., Storm, A.J., Zhang, L.: DB2 with BLU acceleration: so much more than just a column store. PVLDB 6(11), 1080–1091 (2013)

  14. Rowstron, A., Narayanan, D., Donnelly, A., O’Shea, G., Douglas, A.: Nobody ever got fired for using hadoop on a cluster. In: Proceedings of the 1st International Workshop on Hot Topics in Cloud Data Processing, HotCDP 2012, pp. 2:1–2:5. ACM, New York (2012)

  15. Shute, J., Vingralek, R., Samwel, B., Handy, B., Whipkey, C., Rollins, E., Oancea, M., Littlefield, K., Menestrina, D., Ellner, S., Cieslewicz, J., Rae, I., Stancescu, T., Apte, H.: F1: a distributed SQL database that scales. PVLDB 6(11), 1068–1079 (2013)

  16. Sizing Guide for Single Click Configurations of Oracle’s MySQL on Sun Fire x86 Servers. www.oracle.com/technetwork/server-storage/sun-x86/documentation/o11-133-single-click-sizing-mysql-521534.pdf

  17. Zilio, D.C., Rao, J., Lightstone, S., Lohman, G.M., Storm, A.J., Garcia-Arellano, C., Fadden, S.: DB2 design advisor: integrated automatic physical database design. In: VLDB, pp. 1087–1097. Morgan Kaufmann (2004)

Author information

Corresponding author

Correspondence to Martin Boissier.

Appendix

7.1 Execution of CH-benCHmark Queries

The following adaptations were made to run the CH-benCHmark queries:

  • when needed, the extract function (e.g., EXTRACT(YEAR FROM o_entry_d)) has been replaced by the year function (e.g., YEAR(o_entry_d))

  • for MySQL and PostgreSQL, query 15 has been modified to use a view instead of SQL’s HAVING clause (code provided by the OLTP-Bench framework)

  • when needed, aliases have been resolved in case they are not supported in aggregations
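The first adaptation in the list above can be sketched as a simple textual rewrite. This is a minimal illustration only, not the tooling used for the experiments: the function name `rewrite_extract_year` and the regular expression are assumptions, and the pattern only handles plain (optionally qualified) column names, not nested expressions.

```python
import re

# Hypothetical helper, not taken from the paper or from OLTP-Bench.
# Matches EXTRACT(YEAR FROM <column>) case-insensitively, where <column>
# is a plain or table-qualified column name (e.g. o_entry_d or o.o_entry_d).
_EXTRACT_YEAR = re.compile(
    r"EXTRACT\s*\(\s*YEAR\s+FROM\s+([A-Za-z_][\w.]*)\s*\)",
    re.IGNORECASE,
)


def rewrite_extract_year(sql: str) -> str:
    """Replace EXTRACT(YEAR FROM col) with YEAR(col) in a SQL string."""
    return _EXTRACT_YEAR.sub(r"YEAR(\1)", sql)


if __name__ == "__main__":
    query = "SELECT EXTRACT(YEAR FROM o_entry_d) AS l_year"
    print(rewrite_extract_year(query))
```

A regular expression is enough for the simple pattern quoted above; rewriting arbitrary expressions inside `EXTRACT` would need a real SQL parser.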

We set the maximum execution time to 12 h per query. Queries that exceed this limit are excluded from our results even though they are executable; given their long execution time, we assume that they do not terminate.

7.2 TCO Calculations

This section lists the components for an assumed bandwidth requirement of 40 GB/s. Prices were obtained from the official websites of the hardware vendors and do not include any discounts. Energy costs are calculated from the technical specifications of the hardware. Cooling costs are calculated using an assumed Power Usage Effectiveness (PUE) of 1.8, following the Uptime Institute 2012 Data Center Survey (see Note 3). The cost of energy is $0.276 per kWh. Both energy and cooling costs are calculated for a timespan of three years.
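One way to read the energy and cooling assumptions is sketched below. The exact formula is not given in the text, so two points are assumptions: the hardware is taken to draw its rated power continuously, and the whole non-IT overhead implied by the PUE, that is (PUE − 1) times the IT energy, is attributed to cooling. The function name and the 1000 W example value are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # leap days ignored in this sketch


def energy_and_cooling_cost(power_watts, years=3, price_per_kwh=0.276, pue=1.8):
    """Return (energy_cost, cooling_cost) in dollars.

    Assumptions (not confirmed by the paper):
    - the hardware draws power_watts continuously;
    - cooling cost is the overhead implied by the PUE, i.e. (PUE - 1) x IT energy.
    """
    it_kwh = power_watts / 1000 * HOURS_PER_YEAR * years
    energy_cost = it_kwh * price_per_kwh
    cooling_cost = energy_cost * (pue - 1.0)
    return energy_cost, cooling_cost


if __name__ == "__main__":
    # Hypothetical 1000 W server, used only to exercise the formula.
    energy, cooling = energy_and_cooling_cost(1000)
    print(f"energy: ${energy:,.2f}, cooling: ${cooling:,.2f}")
```

Under these assumptions, every watt of hardware power costs about $7.25 in energy and $5.80 in cooling over three years, which is why operational costs weigh heavily on configurations with many disks.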

For the HDD-based and SSD-based systems, each node is a four-processor server (4 × Intel Xeon E7-4850v2 12C/24T 2.3 GHz 24 MB) with an estimated price of $30,000. For both configurations, the size of main memory is set to ~10% of the database volume (i.e., 50 GB for the 500 GB data set).

None of the following example calculations include costs for high availability.

HDD-Based System. The HDD-based system adapts to higher bandwidth requirements by adding direct-attached storage (DAS) units. In this calculation, each node has eight SAS slots. Each DAS unit is connected to two SAS slots, consists of 96 disks (10K rpm, enterprise grade), and is assumed to provide the maximum theoretical throughput of 6 GB/s. It is possible to reach 6 GB/s with fewer 15K disks, but a configuration with 10K disks is more cost-efficient.

Since two SAS slots are used to connect each DAS unit, each server node can connect to a maximum of four DAS units resulting in a peak bandwidth of 24 GB/s. Consequently, any bandwidth higher than 24 GB/s requires an additional server node.
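The sizing rule described above can be sketched as a short calculation. The function name and return structure are illustrative assumptions; the constants (6 GB/s and 96 disks per DAS unit, four DAS units per node) come from the text. The result is derived from those constants only and is not copied from the paper's cost table.

```python
import math


def hdd_sizing(required_gbps, das_gbps=6.0, das_per_node=4, disks_per_das=96):
    """Count DAS units, server nodes, and disks for a bandwidth target.

    Each DAS unit is assumed to deliver das_gbps, and each node can
    attach at most das_per_node units (two SAS slots per unit).
    """
    das_units = math.ceil(required_gbps / das_gbps)
    nodes = math.ceil(das_units / das_per_node)
    return {
        "das_units": das_units,
        "nodes": nodes,
        "disks": das_units * disks_per_das,
    }


if __name__ == "__main__":
    print(hdd_sizing(40))  # {'das_units': 7, 'nodes': 2, 'disks': 672}
    print(hdd_sizing(24))  # {'das_units': 4, 'nodes': 1, 'disks': 384}
```

The step from 24 GB/s to anything higher shows the cost jump described in the text: the fifth DAS unit forces a second server node.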

The hardware setup for the 40 GB/s configuration and its TCO calculation are listed in Table 2.

Table 2. TCO calculation for the HDD-based system.

SSD-Based System. The SSD-based system uses PCIe-connected solid state disks. Recent Intel Xeon CPUs have up to 32 directly connected PCIe lanes per socket. Consequently, we assume a theoretical setup of up to eight PCIe-connected SSDs per server node.

For our calculations, we use a PCIe SSD that provides a peak read bandwidth of 3 GB/s and has a capacity of 1 TB. Faster SSDs are currently available (up to 6 GB/s), but they are more than three times as expensive. We also calculated prices for another PCIe SSD vendor whose drives are almost a factor of two less expensive in their smallest size of 350 GB. We did not include these calculations here, as these drives are currently not available. However, even with these drives, the 40 GB/s configuration is still more expensive than its main memory-based counterpart (Table 3).

Table 3. TCO calculation for the SSD-based system.
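For the SSD-based system, a similar sketch can check whether bandwidth or capacity determines the number of drives. The function name is hypothetical, and 1 TB is taken as 1000 GB; the per-drive bandwidth, drive size, and limit of eight drives per node follow the description above. The numbers are derived from those constants and are not taken from Table 3.

```python
import math


def ssd_sizing(required_gbps, data_gb, ssd_gbps=3.0, ssd_gb=1000, ssds_per_node=8):
    """Count PCIe SSDs and server nodes, and report the binding constraint.

    Assumes peak read bandwidth adds up linearly across drives and that
    1 TB equals 1000 GB.
    """
    by_bandwidth = math.ceil(required_gbps / ssd_gbps)
    by_capacity = math.ceil(data_gb / ssd_gb)
    ssds = max(by_bandwidth, by_capacity)
    return {
        "ssds": ssds,
        "nodes": math.ceil(ssds / ssds_per_node),
        "bound": "bandwidth" if by_bandwidth >= by_capacity else "capacity",
    }


if __name__ == "__main__":
    # 40 GB/s target on the 500 GB data set mentioned in the text.
    print(ssd_sizing(40, 500))  # {'ssds': 14, 'nodes': 2, 'bound': 'bandwidth'}
```

For the 500 GB data set a single drive would satisfy capacity, so the drive count here is driven entirely by the bandwidth requirement.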

Main Memory-Based System. The main memory-based server is equipped with Intel’s latest Xeon E7 CPU. A server with four CPUs (Intel Xeon E7-4890v2 15C/30T 2.8 GHz 37 MB) costs ~$63,000. The costs include a 600 GB enterprise-grade HDD for persistence (Table 4).

Table 4. TCO calculation for the main memory-based system.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Boissier, M., Meyer, C., Uflacker, M., Tinnefeld, C. (2015). And All of a Sudden: Main Memory Is Less Expensive Than Disk. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobson, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_12

  • DOI: https://doi.org/10.1007/978-3-319-20233-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20232-7

  • Online ISBN: 978-3-319-20233-4

  • eBook Packages: Computer Science, Computer Science (R0)
