DOI: 10.1145/3211922.3211927
research-article

Cost/performance in modern data stores: how data caching systems succeed

Published: 11 June 2018

ABSTRACT

Data in traditional "caching" data systems resides on secondary storage, and is read into main memory only when operated on. This limits system performance. Main memory data stores with data always in main memory are much faster. But this performance comes at a cost. In this paper, we analyze the costs of both in-memory operations and secondary storage operations where data is not "in cache". We study the performance impact of cache misses on caching system performance. The analysis considers both execution and storage costs. Based on our analysis, we derive cost/performance results for a data caching system [Deuteronomy and its Bw-tree] and a main memory system [MassTree] to understand where each demonstrates the best cost per operation, what is driving the cost differences, and the scale of the differences. This analysis (1) provides insight into why data caching systems continue to dominate the market; (2) points to higher performance that does not rely on simply increasing main memory cache size; and (3) suggests a path to lower costs and hence better cost/performance.


  • Published in

    DAMON '18: Proceedings of the 14th International Workshop on Data Management on New Hardware
    June 2018
    75 pages
    ISBN:9781450358538
    DOI:10.1145/3211922

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 80 of 102 submissions, 78%
