On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems

  • Conference paper
  • In: Advances in Databases and Information Systems (ADBIS 2024)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14918)

Abstract

To meet response-time and throughput demands, data processing architects continuously adapt query processing systems to novel hardware features. For instance, data processing systems have already shifted from disk-oriented to main-memory-oriented architectures to efficiently exploit the ever-increasing capacities of main memory. Prominent examples of such developments are emerging memory technologies such as very large caches, high-bandwidth memory (HBM), non-uniform memory access (NUMA), and remote-memory designs like CXL. These memories complement regular DRAM by trading off properties such as capacity, read/write throughput, and access latency. However, these degrees of freedom and their inherent complexity make it difficult for database systems to profit from the new hardware.

Taking HBM – as integrated in the 2023 Intel “Sapphire Rapids” Xeon Max processors – as a recent example of emerging memory technologies, we present microbenchmark results in this paper that demonstrate different worker-thread saturation patterns for DRAM and HBM. From these results, we derive characteristics and a cost model for making dynamic, on-the-fly data-distribution and data-movement decisions. Based on this model, we show that for data-processing queries with a specific data-reuse pattern, dynamic (re-)distribution of data from DRAM to HBM can decrease end-to-end query run times.
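The break-even reasoning behind such a data-movement decision can be sketched as follows. This is a minimal illustration of the general idea, not the paper's actual cost model; the function name and the bandwidth figures are assumptions chosen for the example:

```python
def should_move_to_hbm(data_bytes: int, reuse_count: int,
                       dram_bw: float, hbm_bw: float,
                       copy_bw: float) -> bool:
    """Decide whether copying a data chunk from DRAM to HBM pays off.

    Moving is worthwhile when the time saved over all subsequent
    scans of the chunk exceeds the one-time copy cost.
    All bandwidths are in bytes per second.
    """
    t_dram = data_bytes / dram_bw   # per-scan time if the chunk stays in DRAM
    t_hbm = data_bytes / hbm_bw     # per-scan time once resident in HBM
    t_copy = data_bytes / copy_bw   # one-time DRAM -> HBM transfer cost
    saving_per_scan = t_dram - t_hbm
    return reuse_count * saving_per_scan > t_copy

# Example: a 1 GiB chunk with illustrative (assumed) bandwidths:
# DRAM scan at 100 GB/s, HBM scan at 400 GB/s, copy at 80 GB/s.
GiB = 1 << 30
print(should_move_to_hbm(GiB, reuse_count=1, dram_bw=100e9,
                         hbm_bw=400e9, copy_bw=80e9))  # single scan: not worth it
print(should_move_to_hbm(GiB, reuse_count=4, dram_bw=100e9,
                         hbm_bw=400e9, copy_bw=80e9))  # repeated scans: worth it
```

This captures why a data-reuse pattern matters: a chunk scanned only once never amortizes the copy cost, while a chunk reused across operators or queries quickly does.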


Notes

  1. Slightly below the available HBM of 16 GiB per NUMA node, to avoid an out-of-memory condition of unknown origin that we experience on Linux.


Acknowledgements

We gratefully acknowledge Dr. Alexander Krause for providing an initial microbenchmark setup for accurate measurements. This work was partly supported by the German Research Foundation (DFG) priority program SPP 2377 under grants no. SCHI 1447/1-1 and LE 1416/30-1.

Author information

Correspondence to André Berthold.


Ethics declarations

Disclosure of Interests

The authors declare no competing interests.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Berthold, A., Schmidt, L., Obersteiner, A., Habich, D., Lehner, W., Schirmeier, H. (2024). On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems. In: Tekli, J., Gamper, J., Chbeir, R., Manolopoulos, Y. (eds) Advances in Databases and Information Systems. ADBIS 2024. Lecture Notes in Computer Science, vol 14918. Springer, Cham. https://doi.org/10.1007/978-3-031-70626-4_12

  • DOI: https://doi.org/10.1007/978-3-031-70626-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70628-8

  • Online ISBN: 978-3-031-70626-4

  • eBook Packages: Computer Science (R0)
