On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems

  • Conference paper
  • In: Advances in Databases and Information Systems (ADBIS 2024)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14918)

Abstract

To meet response-time and throughput demands, data processing architects continuously adapt query processing systems to novel hardware features. For instance, data processing systems have already shifted from disk-oriented to main-memory-oriented architectures to efficiently exploit the ever-increasing capacities of main memory. Prominent examples of such developments are emerging memory technologies such as very large caches, high-bandwidth memory (HBM), non-uniform memory access (NUMA), and remote-memory designs like CXL. These memories complement regular DRAM by trading off properties such as capacity, read/write throughput, and access latency. However, these degrees of freedom and their inherent complexity make it difficult for database systems to profit from the new hardware.

Taking HBM – as integrated in the 2023 Intel “Sapphire Rapids” Xeon Max processors – as a recent example of emerging memory technologies, we present microbenchmark results in this paper that demonstrate different worker-thread saturation patterns for DRAM and HBM. From these results, we derive characteristics and a cost model for making dynamic, on-the-fly data-distribution and data-movement decisions. Based on this model, we show that for data-processing queries with a specific data-reuse pattern, dynamic (re-)distribution of data from DRAM to HBM can decrease end-to-end query run times.
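The break-even reasoning behind such a data-movement decision can be sketched as follows. This is a minimal illustration of the general idea, not the paper's actual cost model; the function name and the bandwidth figures are assumptions chosen for the example:

```python
def should_move_to_hbm(data_bytes: int, reuse_count: int,
                       dram_bw: float, hbm_bw: float,
                       copy_bw: float) -> bool:
    """Decide whether copying a data chunk from DRAM to HBM pays off.

    Moving is worthwhile when the time saved over all subsequent
    scans of the chunk exceeds the one-time copy cost.
    All bandwidths are in bytes per second.
    """
    t_dram = data_bytes / dram_bw   # per-scan time if the chunk stays in DRAM
    t_hbm = data_bytes / hbm_bw     # per-scan time once resident in HBM
    t_copy = data_bytes / copy_bw   # one-time DRAM -> HBM transfer cost
    saving_per_scan = t_dram - t_hbm
    return reuse_count * saving_per_scan > t_copy

# Example: a 1 GiB chunk with illustrative (assumed) bandwidths:
# DRAM scan at 100 GB/s, HBM scan at 400 GB/s, copy at 80 GB/s.
GiB = 1 << 30
print(should_move_to_hbm(GiB, reuse_count=1, dram_bw=100e9,
                         hbm_bw=400e9, copy_bw=80e9))  # single scan: not worth it
print(should_move_to_hbm(GiB, reuse_count=4, dram_bw=100e9,
                         hbm_bw=400e9, copy_bw=80e9))  # repeated scans: worth it
```

This captures why a data-reuse pattern matters: a chunk scanned only once never amortizes the copy cost, while a chunk reused across operators or queries quickly does.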


Notes

  1. Slightly below the available HBM of 16 GiB per NUMA node, to avoid an out-of-memory condition of unknown origin that we experience on Linux.


Acknowledgements

We gratefully acknowledge Dr. Alexander Krause for providing an initial microbenchmark setup for accurate measurements. This work was partly supported by the German Research Foundation (DFG) priority program SPP 2377 under grants no. SCHI 1447/1-1 and LE 1416/30-1.

Author information

Correspondence to André Berthold.


Ethics declarations

Disclosure of Interests

The authors declare no competing interests.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Berthold, A., Schmidt, L., Obersteiner, A., Habich, D., Lehner, W., Schirmeier, H. (2024). On-The-Fly Data Distribution to Accelerate Query Processing in Heterogeneous Memory Systems. In: Tekli, J., Gamper, J., Chbeir, R., Manolopoulos, Y. (eds) Advances in Databases and Information Systems. ADBIS 2024. Lecture Notes in Computer Science, vol 14918. Springer, Cham. https://doi.org/10.1007/978-3-031-70626-4_12

  • DOI: https://doi.org/10.1007/978-3-031-70626-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70628-8

  • Online ISBN: 978-3-031-70626-4

  • eBook Packages: Computer Science (R0)
