ABSTRACT
Precise prediction of query execution performance is the basis for many database optimization strategies. Columnar in-memory databases change cost modeling in two respects. First, models built for disk-based databases are not well suited, because the new bottleneck is main memory access. Second, the ability to execute mixed workloads creates new challenges: for transactional and analytical queries with aggregation operations, memory access patterns, and thus execution times, vary significantly. This paper discusses how data characteristics influence aggregation operations and identifies factors that existing cost model approaches do not consider. Further, we present benchmarks implemented and executed on a columnar in-memory research database to support our assumptions.