DOI: 10.1145/2390045.2390057
research-article

An in-depth analysis of data aggregation cost factors in a columnar in-memory database

Published: 02 November 2012

ABSTRACT

Precise prediction of query execution performance is the basis for various database optimization strategies. With columnar in-memory databases, cost modeling changes in two dimensions: First, models for disk-based databases are not well suited, as the new bottleneck is main memory access. Second, the ability to execute mixed workloads creates new challenges. For transactional and analytical queries with aggregation operations, memory access patterns, and thus execution times, vary significantly. This paper discusses the influence of data characteristics on aggregation operations and highlights factors not considered by existing cost model approaches. Further, we present benchmarks implemented and executed on a columnar in-memory research database to support our assumptions.


    • Published in

      DOLAP '12: Proceedings of the fifteenth international workshop on Data warehousing and OLAP
      November 2012
      154 pages
      ISBN:9781450317214
      DOI:10.1145/2390045

      Copyright © 2012 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 29 of 79 submissions, 37%
