Abstract
To meet response time and throughput demands, data processing architects continuously adapt query processing systems to novel hardware features. For instance, data processing systems have already shifted from disk-oriented to main memory-oriented architectures to efficiently exploit the ever-increasing capacities of main memory. A prominent example of such developments is the class of emerging memory technologies, including very large caches, high-bandwidth memory (HBM), non-uniform memory access (NUMA), and remote-memory designs like CXL. These memories complement regular DRAM by trading off properties such as capacity, read/write throughput, and access latency. However, these degrees of freedom and their inherent complexity make it difficult for database systems to profit from the new hardware.
In this paper, we take HBM – as integrated in the 2023 Intel “Sapphire Rapids” Xeon Max processors – as a recent example of emerging memory technologies and present microbenchmark results that demonstrate different worker-thread saturation patterns for DRAM and HBM. We then derive characteristics and a cost model to make dynamic on-the-fly data-distribution/movement decisions. Based on this model, we show that for data processing queries with a specific data-reuse pattern, dynamic data (re-)distribution from DRAM to HBM can decrease end-to-end query run times.
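To make the placement decision concrete, the following is a minimal sketch of how a cost-model-driven DRAM-to-HBM redistribution could look on a Linux NUMA system using libnuma. The bandwidth figures, the NUMA node ids, and the should_move_to_hbm helper are illustrative assumptions, not the measurements or implementation from the paper; the decision rule simply amortizes the one-time copy cost over the expected number of scans, mirroring the data-reuse argument above.

```cpp
// Minimal sketch of cost-model-driven DRAM->HBM placement (assumed parameters,
// not the paper's implementation). Build with: g++ -O2 hbm_place.cpp -lnuma
#include <numa.h>
#include <cstring>
#include <cstdio>

// Assumed, machine-specific characteristics (GiB/s); in practice these would
// come from microbenchmarks like the ones described in the paper.
struct MemCharacteristics {
    double dram_scan_bw = 250.0;  // aggregate scan bandwidth from DRAM
    double hbm_scan_bw  = 800.0;  // aggregate scan bandwidth from HBM
    double copy_bw      = 200.0;  // DRAM -> HBM copy bandwidth
};

// Move data to HBM only if the bandwidth saved over all expected re-reads
// outweighs the one-time copy cost.
bool should_move_to_hbm(size_t bytes, int expected_reuses,
                        const MemCharacteristics& c) {
    const double gib = static_cast<double>(bytes) / (1ull << 30);
    const double copy_cost       = gib / c.copy_bw;
    const double per_scan_saving = gib / c.dram_scan_bw - gib / c.hbm_scan_bw;
    return expected_reuses * per_scan_saving > copy_cost;
}

int main() {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not available on this system\n");
        return 1;
    }

    const size_t bytes = 1ull << 30;  // 1 GiB column, for illustration
    const int dram_node = 0;          // assumed DRAM NUMA node id
    const int hbm_node  = 8;          // assumed HBM NUMA node id (platform-specific)

    // The base copy of the column resides in DRAM.
    char* col = static_cast<char*>(numa_alloc_onnode(bytes, dram_node));
    if (!col) return 1;
    memset(col, 0x42, bytes);

    MemCharacteristics c;
    int expected_reuses = 5;          // e.g., the column is scanned by 5 operators

    if (should_move_to_hbm(bytes, expected_reuses, c)) {
        // On-the-fly redistribution: materialize a copy on the HBM node.
        char* hbm_col = static_cast<char*>(numa_alloc_onnode(bytes, hbm_node));
        if (hbm_col) {
            memcpy(hbm_col, col, bytes);
            printf("column placed in HBM for %d expected scans\n", expected_reuses);
            numa_free(hbm_col, bytes);
        }
    } else {
        printf("copy cost not amortized; column stays in DRAM\n");
    }

    numa_free(col, bytes);
    return 0;
}
```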
Notes
1. Slightly below the available HBM of 16 GiB per NUMA node, to avoid an out-of-memory condition of unknown origin that we experience on Linux.
Acknowledgements
We gratefully acknowledge Dr. Alexander Krause for an initial microbenchmark setup to perform accurate measurements. This work was partly supported by the German Research Foundation (DFG) priority program SPP 2377 under grants no. SCHI 1447/1-1 and LE 1416/30-1.
Ethics declarations
Disclosure of Interests
The authors declare no competing interests.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Berthold, A., Schmidt, L., Obersteiner, A., Habich, D., Lehner, W., Schirmeier, H. (2024). On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems. In: Tekli, J., Gamper, J., Chbeir, R., Manolopoulos, Y. (eds) Advances in Databases and Information Systems. ADBIS 2024. Lecture Notes in Computer Science, vol 14918. Springer, Cham. https://doi.org/10.1007/978-3-031-70626-4_12
DOI: https://doi.org/10.1007/978-3-031-70626-4_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70628-8
Online ISBN: 978-3-031-70626-4