Abstract
The increasing heterogeneity of the underlying hardware forces modern database system engineers to implement multiple variants of a single database operator (e.g., join, selection). With increasing heterogeneity, these variants become too complex to maintain and tune for different devices. To overcome these disadvantages, developers use an alternative, primitive-based operator design. This design paradigm splits the database operators into granular functions or primitives and executes a given operator by combing the necessary primitives. Hence, we require only a limited set of these primitives as we reuse them for multiple database operations. Thus, tuning a single primitive improves efficiency of all the database operations using it.
In this survey, we provide an overview of a primitive-based database engine. First, we list different primitives from literature and place them in a hierarchy from the finest granular level to a complete database operator. Second, for each of primitive we list its possible tuning opportunities. Finally, we discuss the significance of primitive-based execution on the query engine. Overall, this survey aims to serve as a general reference for implementing a primitive-based query engine and possible strategies to tune it for specific processors.
Similar content being viewed by others
References
Abadi DJ, Myers DS, DeWitt DJ, Madden S (2007) Materialization strategies in a column-oriented DBMS. Proceedings of the International Conference on Data Engineering (ICDE). IEEE, Istanbul, Turkey, pp 466–475
Albutiu MC, Kemper A, Neumann T (2012) Massively parallel sort-merge joins in main memory multi-core database systems. Proceedings VLDB Endowment 5(10):1064–1075
Blelloch GE (1990a) Prefix sums and their applications
Blelloch GE (1990b) Vector models for data-parallel computing. MIT Press, Cambridge, MA, USA
Boncz PA, Kersten ML (1999) MIL primitives for querying a fragmented world. VLDB J 8(2):101–119
Breß S (2014) The design and implementation of CoGaDB: a column-oriented GPU-accelerated DBMS. Datenbank Spektrum 14(3):199–209
Breß S, Köcher B, Funke H, Rabl T, Markl V (2017) Generating custom code for efficient query execution on heterogeneous processors. Computing Research Repository (CoRR) abs/1709.00700
Broneske D, Breß S, Heimel M, Saake G (2014) Toward hardware-sensitive database operations. Proeedings of the International Conference on Extending Database Technology (EDBT), pp 1–6
Broneske D, Köppen V, Saake G, Schäler M (2017a) Accelerating multi-column selection predicates in main-memory – the Elf approach. In: Proceedings of the International Conference on Data Engineering (ICDE), pp 647–658
Broneske D, Meister A, Saake G (2017b) Hardware-sensitive scan operator variants for compiled selection pipelines. In: Datenbanksysteme für Business, Technologie und Web (BTW)
Diamos G, Wu H, Lele A, Wang J et al (2012) Efficient relational algebra algorithms and data structures for GPU. Tech. rep.. Georgia Institute of Technology, Atlanta, Georgia
Govindaraju N, Raghuvanshi N, Henson M, Tuft D, Manocha D (2005) A cache-efficient sorting algorithm for database and data mining computations using graphics processors. Tech. rep.. University of North Carolina, Chapel Hill, NC, USA
Graefe G (1993) Query evaluation techniques for large databases. ACM Comput Surv 25(2):73–170
He B, Yang K, Fang R, Lu M, Govindaraju N, Luo Q, Sander P (2008) Relational joins on graphics processors. Proceedings of the International Conference on Management of Data (SIGMOD). ACM, Vancouver, Canada, pp 511–524
He B, Lu M, Yang K, Fang R, Govindaraju NK, Luo Q, Sander PV (2009) Relational query coprocessing on graphics processors. ACM Trans Database Syst 34(4):1–21
Heimel M, Saecker M, Pirk H, Manegold S, Markl V (2013) Hardware-oblivious parallelism for in-memory column-stores. Proceedings VLDB Endowment 6(9):709–720
Hennessy JL, Patterson DA (2011) Computer architecture: a quantitative approach, 5th edn. Morgan Kaufmann Publishers Inc, San Francisco, CA, USA
Horn D (2005) Stream reduction operations for GPGPU applications. In: Pharr M (ed) GPU Gems, vol 2. Addison-Wesley, pp 573–589
Kim C et al (2010) FAST: fast architecture sensitive tree search on modern CPUs and GPUs. Proceedings of the International Conference on Management of Data (SIGMOD). ACM, Indianapolis, Indiana, USA, pp 339–350
Knuth DE (1997) The art of computer programming: fundamental algorithms, vol 1, 3rd edn. Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, USA
Neumann T (2011) Efficiently compiling efficient query plans for modern hardware. Proceedings VLDB Endowment 4(9):539–550
Pantela S, Idreos S (2015) One loop does not fit all. Proceedings of the International Conference on Management of Data (SIGMOD). ACM, Melbourne, Australia, pp 2073–2074
Pirk H, Moll O, Zaharia M, Madden S (2016) Voodoo – A vector algebra for portable database performance on modern hardware. Proceedings VLDB Endowment 9(14):1707–1718
Polychroniou O, Ross KA (2014) A comprehensive study of main-memory partitioning and its application to large-scale comparison- and radix-sort. Proceedings of the International Conference on Management of Data (SIGMOD), pp 755–766
Polychroniou O, Ross KA (2015) Efficient lightweight compression alongside fast scans. Proceedings of the International Workshop on Data Management on New Hardware (DaMoN).
Polychroniou O, Raghavan A, Ross KA (2015) Rethinking simd vectorization for in-memory databases. Proceedings of the International Conference on Management of Data (SIGMOD), pp 1493–1508
Rao J, Ross K (2000) Making B+-Trees cache conscious in main memory. In: Proceedings of the International Conference on Management of Data (SIGMOD). ACM, Dallas, Texas, USA, pp 475–486
Rauhe H, Dees J, Sattler KU, Faerber F (2013) Multi-level parallel query execution framework for CPU and GPU. In: Proceedings of the European Conference on Advances in Databases and Information Systems (ADBIS). Springer, Genoa, Italy, pp 330–343
Richter S, Alvarez V, Dittrich J (2015) A seven-dimensional analysis of hashing methods and its implications on query processing. Proceedings VLDB Endowment 9(3):96–107
Rosenfeld V, Heimel M, Viebig C, Markl V (2015) The operator variant selection problem on heterogeneous hardware. Proceedings of the International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures (ADMS).
Ross KA (2004) Selection conditions in main memory. ACM Trans Database Syst 29(1):132–161
Ross KA (2007) Efficient hash probes on modern processors. In: Proceedings of the International Conference on Data Engineering (ICDE), pp 1297–1301
Schuhknecht FM, Khanchandani P, Dittrich J (2015) On the surprising difficulty of simple things. Proceedings VLDB Endowment 8(9):934–937
Sengupta S, Harris M, Zhang Y, Owens JD (2007) Scan primitives for GPU computing. In: Proceedings of the ACM Symposium on Graphics Hardware (SIGGRAPH). Eurographics Association, San Diego, California, pp 97–106
Sidler D, Owaida M, István Z, Kara K, Alonso G (2017) doppiodb: a hardware accelerated database. In: International Conference on Field Programmable Logic and Applications (FPL), pp 1–1
Sitaridi EA, Ross KA (2013) Optimizing select conditions on GPUs. Proceedings of the International Workshop on Data Management on New Hardware (DaMoN), pp 1–8
Willhalm T, Boshmaf Y, Plattner H, Popovici N, Zeier A, Schaffner J (2009) SIMD-scan: ultra fast in-memory table scan using on-chip vector processing units. Proceedings VLDB Endowment 2(1):385–394
Willhalm T, Oukid I, Müller I, Faerber F (2013) Vectorizing database column scans with complex predicates. In: Proceedings of the International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures (ADMS), pp 1–12
Wu H, Diamos G, Cadambi S, Yalamanchili S (2012) Kernel weaver: automatically fusing database primitives for efficient GPU computation. In: Proceedings of the International Symposium on Microarchitecture (MICRO). IEEE, Vancouver, BC, Canada, pp 107–118
Zeuch S, Freytag J (2015) Selection on modern CPUs. Proceedings of the International Workshop on In-Memory Data Mangement and Analytics (IMDM), pp 1–8
Zeuch S, Freytag JC, Huber F (2014) Adapting tree structures for processing with SIMD instructions. In: Proceedings of the International Conference on Extending Database Technology (EDBT), pp 97–108
Acknowledgements
This work was partially funded by the DFG (grant no.: SA 465/51-1 and PI 447/9)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Gurumurthy, B., Broneske, D., Drewes, T. et al. Cooking DBMS Operations using Granular Primitives. Datenbank Spektrum 18, 183–193 (2018). https://doi.org/10.1007/s13222-018-0295-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13222-018-0295-8