DOI: 10.1145/3592980.3595315
research-article

Accelerating Main-Memory Table Scans with Partial Virtual Views

Published: 18 June 2023

Abstract

In main-memory column stores, column scans are one of the base operations performed when answering analytical queries. Typically, one or multiple columns must be filtered with respect to the given query predicate, which, by default, involves inspecting all data of the involved columns. To reduce the amount of data to scan, there are essentially two strategies: (1) Create a coarse-granular index on the column, then use it for early pruning during each scan. While creating such an index is relatively lightweight, accessing the relevant portions of the column through the index adds overhead to every scan. (2) Create materialized views that contain semantic portions of the column and filter on these. While this enables fast scans, it requires physical copying and causes significant space overhead. To break this trade-off, we propose a view-based strategy that avoids any physical copying of column data while providing optimal scan performance. We achieve this by utilizing tools of the virtual memory subsystem provided by the OS: On the lowest level, we materialize all columns within physical main memory. On top of that, we allow the creation of arbitrarily many partial views in virtual memory that map to subsets of the physical columns having certain properties of interest. Creation, maintenance, and usage of these partial virtual views happen fully adaptively as a side product of scan-based query processing.
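The core idea, several virtual mappings that alias the same physical pages, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the OS calls involved, so the sketch assumes plain `mmap` on a shared backing file (a temporary file stands in for main memory), and the function names `create_column`, `scan_and_build_view`, and `scan_view` are hypothetical. It also assumes views are page-granular and defined by a value range, and that the column fills whole pages. Because Python's `mmap` module does not expose fixed-address mappings, the view is kept as a list of single-page mappings rather than one contiguous virtual region.

```python
import mmap
import struct
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mapping offsets must be multiples of this
VALUES_PER_PAGE = PAGE // 8        # column of 64-bit signed integers


def create_column(values):
    """Materialize the column once in a shared backing file (assumption:
    the file stands in for physical main memory). Expects whole pages."""
    assert len(values) % VALUES_PER_PAGE == 0
    n_pages = len(values) // VALUES_PER_PAGE
    backing = tempfile.TemporaryFile()
    backing.truncate(n_pages * PAGE)
    column = mmap.mmap(backing.fileno(), n_pages * PAGE)
    struct.pack_into(f"<{len(values)}q", column, 0, *values)
    return backing, column, n_pages


def scan_and_build_view(backing, column, n_pages, lo, hi):
    """Full scan for lo <= v <= hi. As a side product, every page holding
    at least one match is mapped a second time to form a partial view."""
    hits, view = [], []
    fmt = f"<{VALUES_PER_PAGE}q"
    for p in range(n_pages):
        matches = [v for v in struct.unpack_from(fmt, column, p * PAGE)
                   if lo <= v <= hi]
        if matches:
            # New virtual mapping of the same page; no column data is copied.
            view.append(mmap.mmap(backing.fileno(), PAGE, offset=p * PAGE))
            hits.extend(matches)
    return hits, view


def scan_view(view, lo, hi):
    """Answer a query whose range lies inside the view's range by
    touching only the pages mapped into the view."""
    hits = []
    fmt = f"<{VALUES_PER_PAGE}q"
    for page in view:
        hits.extend(v for v in struct.unpack_from(fmt, page, 0)
                    if lo <= v <= hi)
    return hits
```

A later query whose range falls inside the range that produced the view can call `scan_view` and skip all pages that were not mapped. Since the view pages alias the column's pages, a write through `column` is visible through the view without any maintenance copy.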

Published In

DaMoN '23: Proceedings of the 19th International Workshop on Data Management on New Hardware
June 2023
119 pages
ISBN:9798400701917
DOI:10.1145/3592980
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGMOD/PODS '23
Acceptance Rates

DaMoN '23 Paper Acceptance Rate 17 of 23 submissions, 74%;
Overall Acceptance Rate 94 of 127 submissions, 74%
