ABSTRACT
In this paper, we present a new approach for achieving robust performance of data structures making it easier to reuse the same design for different hardware generations but also for different workloads. To achieve robust performance, the main idea is to strictly separate the data structure design from the actual strategies to execute access operations and adjust the actual execution strategies by means of so-called configurations instead of hard-wiring the execution strategy into the data structure. In our evaluation we demonstrate the benefits of this configuration approach for individual data structures as well as complex OLTP workloads.
Supplemental Material
- Manos Athanassoulis, Kenneth S. Bøgh, and Stratos Idreos. 2019. Optimal Column Layout for Hybrid Workloads. Proceedings of the VLDB Endowment, Vol. 12, 13 (2019), 2393--2407. https://doi.org/10.14778/3358701.3358707Google ScholarDigital Library
- Nick Benton, Luca Cardelli, and Cédric Fournet. 2004. Modern Concurrency Abstractions for C#. ACM Trans. Program. Lang. Syst., Vol. 26, 5 (2004), 769--804. https://doi.org/10.1145/1018203.1018205Google ScholarDigital Library
- Timo Bingmann. 2013. STX B+ Tree C+ Template Classes. http://panthema.net/2007/stx-btree/, accessed 2019-05--23.Google Scholar
- Robert Binna, Eva Zangerle, Martin Pichl, Günther Specht, and Viktor Leis. 2018. HOT: A Height Optimized Trie Index for Main-Memory Database Systems. In Proceedings of the 2018 International Conference on Management of Data - SIGMOD '18. ACM, 521--534. https://doi.org/10.1145/3183713.3196896Google ScholarDigital Library
- Trevor Brown, Alex Kogan, Yossi Lev, and Victor Luchangco. 2016. Investigating the Performance of Hardware Transactions on a Multi-Socket Machine. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 121--132. https://doi.org/10.1145/2935764.2935796Google ScholarDigital Library
- Irina Calciu, Dave Dice, Tim Harris, Maurice Herlihy, Alex Kogan, Virendra Marathe, and Mark Moir. 2013. Message Passing or Shared Memory: Evaluating the Delegation Abstraction for Multicores. In Proceedings of the 17th International Conference on Principles of Distributed Systems - Volume 8304 (OPODIS 2013). Springer-Verlag, 83--97. https://doi.org/10.1007/978--3--319-03850--6_7Google ScholarDigital Library
- Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, and Marcos K. Aguilera. 2018. How to implement any concurrent data structure. Commun. ACM, Vol. 61, 12 (2018), 97--105. https://doi.org/10.1145/3282506Google ScholarDigital Library
- Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC '10). ACM, New York, NY, USA, 143--154. https://doi.org/10.1145/1807128.1807152Google ScholarDigital Library
- Travis Craig. 1993. Building FIFO and Priority Queuing Spin Locks from Atomic Swap. Technical Report. TR 93-02-02, University of Washington.Google Scholar
- Tudor David, Rachid Guerraoui, and Vasileios Trigonakis. 2013. Everything You Always Wanted to Know About Synchronization But Were Afraid to Ask. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 2522714, 33--48. https://doi.org/10.1145/2517349.2522714Google ScholarDigital Library
- Cristian Diaconu, Craig Freedman, Erik Ismert, Per-Ake Larson, Pravin Mittal, Ryan Stonecipher, Nitin Verma, and Mike Zwilling. 2013. Hekaton: SQL Server's Memory-optimized OLTP Engine. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD '13). ACM, 1243--1254. https://doi.org/10.1145/2463676.2463710Google ScholarDigital Library
- Dave Dice, Virendra J. Marathe, and Nir Shavit. 2011. Flat-Combining NUMA Locks. In Proceedings of the Twenty-Third Annual ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 65--74. https://doi.org/10.1145/1989493.1989502Google ScholarDigital Library
- David Dice, Virendra J. Marathe, and Nir Shavit. 2015. Lock Cohorting: A General Technique for Designing NUMA Locks. ACM Trans. Parallel Comput., Vol. 1, 2 (2015), 1--42. https://doi.org/10.1145/2686884Google ScholarDigital Library
- Peter Dinda, Thomas Gross, David O'Hallaron, Edward Segall, and Jon Webb. 1994. The Carnegie Mellon University Task Parallel Program Suite. Report. Carnegie-Mellon University Pittsburgh, Dept. of Computer Science.Google Scholar
- Markus Dreseler, Thomas Kissinger, Timo Djürken, Eric Lübke, Matthias Uflacker, Dirk Habich, Hasso Plattner, and Wolfgang Lehner. 2017. Hardware-Accelerated Memory Operations on Large-Scale NUMA Systems.. In ADMS at VLDB. 34--41.Google Scholar
- Jose M. Faleiro and Daniel J. Abadi. 2017. Latch-free Synchronization in Database Systems: Silver Bullet or Fool's Gold?. In In Proceedings of the 8th Biennial Conference on Innovative Data Systems Research (CIDR '17). 1--12.Google Scholar
- Goetz Graefe, Arnd Christian Kö nig, Harumi Anne Kuno, Volker Markl, and Kai-Uwe Sattler (Eds.). 2010. Robust Query Processing, 19.09. - 24.09.2010. Dagstuhl Seminar Proceedings, Vol. 10381. Schloss Dagstuhl - Leibniz-Zentrum fü r Informatik, Germany.Google Scholar
- Intel Corporation. 2019. Intel® 64 and IA-32 Architectures Software Developer's Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4. Report.Google Scholar
- Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. 2008. H-Store: a High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., Vol. 1, 2 (2008), 1496--1499. https://doi.org/10.14778/1454159.1454211Google ScholarDigital Library
- Thomas Kissinger, Tim Kiefer, Benjamin Schlegel, Dirk Habich, Daniel Molka, and Wolfgang Lehner. 2014. ERIS: A Numa-Aware In-Memory Storage Engine for Analytical Workloads. Proceedings of the VLDB Endowment, Vol. 7, 14 (2014), 1--12. https://doi.org/10.1.1.475.3378Google Scholar
- Sven O. Krumke and Clemens Thielen. 2013. The Generalized Assignment Problem with Minimum Quantities. European Journal of Operational Research, Vol. 228, 1 (2013), 46--55. https://doi.org/10.1016/j.ejor.2013.01.027Google ScholarCross Ref
- Viktor Leis, Peter Boncz, Alfons Kemper, and Thomas Neumann. 2014. Morsel-driven Parallelism: A NUMA-aware Query Evaluation Framework for the Many-core Age. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD '14). ACM, New York, NY, USA, 743--754. https://doi.org/10.1145/2588555.2610507Google ScholarDigital Library
- Justin J. Levandoski, David B. Lomet, and Sudipta Sengupta. 2013. The Bw-Tree: A B-tree for new hardware platforms. In 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8--12, 2013. 302--313.Google ScholarDigital Library
- Jean-Pierre Lozi, Florian David, Gaël Thomas, Julia Lawall, and Gilles Muller. 2012. Remote Core Locking: Migrating Critical-Section Execution to Improve the Performance of Multithreaded Applications. In Proceedings of the 2012 USENIX conference on Annual Technical Conference. USENIX Association, 6--6.Google Scholar
- Ismail Oukid, Johan Lasperas, Anisoara Nica, Thomas Willhalm, and Wolfgang Lehner. 2016. FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, 371--386. https://doi.org/10.1145/2882903.2915251Google ScholarDigital Library
- Oguzhan Ozmen, Kenneth Salem, Jiri Schindler, and Steve Daniel. 2010. Workload-Aware Storage Layout for Database Systems. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. Association for Computing Machinery, 939--950. https://doi.org/10.1145/1807167.1807268Google ScholarDigital Library
- Ippokratis Pandis, Ryan Johnson, Nikos Hardavellas, and Anastasia Ailamaki. 2010. Data-Oriented Transaction Execution. Proceedings of the VLDB Endowment, Vol. 3, 1--2 (2010), 928--939. https://doi.org/10.14778/1920841.1920959Google ScholarDigital Library
- Orestis Polychroniou and Kenneth A. Ross. 2014. A Comprehensive Study of Main-Memory Partitioning and its Application to Large-Scale Comparison- and Radix-Sort. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Association for Computing Machinery, 755--766. https://doi.org/10.1145/2588555.2610522Google Scholar
- Danica Porobic, Ippokratis Pandis, Miguel Branco, Pinar Tözün, and Anastasia Ailamaki. 2016. Characterization of the Impact of Hardware Islands on OLTP. The VLDB Journal, Vol. 25, 5 (2016), 625--650. https://doi.org/10.1007/s00778-015-0413--2Google ScholarDigital Library
- Iraklis Psaroudakis, Tobias Scheuer, Norman May, Abdelkader Sellami, and Anastasia Ailamaki. 2016. Adaptive NUMA-aware data placement and task scheduling for analytical workloads in main-memory column-stores. Proc. VLDB Endow., Vol. 10, 2 (2016), 37--48. https://doi.org/10.14778/3015274.3015275Google ScholarDigital Library
- Jun Rao and Kenneth A. Ross. 1999. Cache Conscious Indexing for Decision-Support in Main Memory. In VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7--10, 1999, Edinburgh, Scotland, UK. 78--89.Google Scholar
- Jun Rao and Kenneth A. Ross. 2000. Making B+-Trees Cache Conscious in Main Memory. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, May 16--18, 2000, Dallas, Texas, USA. 475--486. https://doi.org/10.1145/342009.335449Google Scholar
- James Reinders. 2007. Intel Threading Building Blocks: Outfitting C+ for Multi-Core Processor Parallelism .O'Reilly Media, Inc.Google Scholar
- Kun Ren, Jose M. Faleiro, and Daniel J. Abadi. 2016. Design Principles for Scaling Multi-core OLTP Under High Contention. In Proceedings of the 2016 International Conference on Management of Data. ACM, 1583--1598. https://doi.org/10.1145/2882903.2882958Google Scholar
- Sepideh Roghanchi, Jakob Eriksson, and Nilanjana Basu. 2017. Ffwd: Delegation is (Much) Faster Than You Think. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17). ACM, 342--358. https://doi.org/10.1145/3132747.3132771Google ScholarDigital Library
- Utku Sirin, Ahmad Yasin, and Anastasia Ailamaki. 2017. A Methodology for OLTP Micro-Architectural Analysis. In Proceedings of the 13th International Workshop on Data Management on New Hardware. ACM, 3076116, 1--10. https://doi.org/10.1145/3076113.3076116Google ScholarDigital Library
- Peter Thoman, Kiril Dichev, Thomas Heller, Roman Iakymchuk, Xavier Aguilar, Khalid Hasanov, Philipp Gschwandtner, Pierre Lemarinier, Stefano Markidis, Herbert Jordan, Thomas Fahringer, Kostas Katrinis, Erwin Laure, and Dimitrios S. Nikolopoulos. 2018. A Taxonomy of Task-Based Parallel Programming Technologies for High-Performance Computing. The Journal of Supercomputing, Vol. 74, 4 (2018), 1422--1434. https://doi.org/10.1007/s11227-018--2238--4Google ScholarDigital Library
- Robert Virding, Claes Wikström, Mike Williams, and Joe Armstrong. 1996. Concurrent Programming in ERLANG (2nd Ed.) .Prentice Hall International (UK) Ltd.Google Scholar
- Ziqi Wang, Andrew Pavlo, Hyeontaek Lim, Viktor Leis, Huanchen Zhang, Michael Kaminsky, and David G. Andersen. 2018. Building a Bw-Tree Takes More Than Just Buzz Words. In Proceedings of the 2018 International Conference on Management of Data - SIGMOD '18. ACM, 473--488. https://doi.org/10.1145/3183713.3196895Google Scholar
- Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, and Michael Stonebraker. 2014. Staring Into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores. Proceedings of the VLDB Endowment, Vol. 8, 3 (2014), 209--220. https://doi.org/10.14778/2735508.2735511Google ScholarDigital Library
Recommendations
Time-shift replacement algorithm for main memory performance optimization
Page replacement algorithms of main memory in modern operating systems are crucial in system performance. When memory is full, a page replacement algorithm exploits temporal locality and frequency of page references to evict the page that is least ...
Main-memory index structures with fixed-size partial keys
The performance of main-memory index structures is increasingly determined by the number of CPU cache misses incurred when traversing the index. When keys are stored indirectly, as is standard in main-memory databases, the cost of key retrieval in terms ...
Comments