DOI: 10.1145/2259016.2259025
Research article

Micro-specialization: dynamic code specialization of database management systems

Published: 31 March 2012

Abstract

Database management systems (DBMSes) form a cornerstone of modern IT infrastructure, and it is essential that they have excellent performance. Much of the work to date on optimizing DBMS performance has emphasized ensuring efficient data access from secondary storage. This paper shows that DBMSes can also benefit significantly from dynamic code specialization. Our approach focuses on the iterative query evaluation loops typically used by such systems. Query evaluation involves extensive references to the relational schema, predicate values, and join types, which are all invariant during query evaluation, and thus are subject to dynamic value-based code specialization.
We introduce three distinct types of specialization, each corresponding to a particular kind of invariant. We realize these techniques, collectively termed micro-specialization, via a DBMS-independent run-time environment and apply them to PostgreSQL, a high-performance open-source DBMS. We show that micro-specialization requires minimal changes to the DBMS and can simultaneously improve storage, CPU usage, and I/O time across a wide range of queries and modifications in standard DBMS benchmarks. We also discuss an integrated development environment that helps DBMS developers apply micro-specializations to identified target code sequences.
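
To make the idea of value-based specialization concrete, here is a minimal sketch in Python. It is an analogy only: the paper works on native code inside PostgreSQL, whereas this sketch uses closures to bind values that stay invariant during a query (a column's offset in the schema, and the predicate's operator and constant) once, before the evaluation loop runs. The schema layout and the names `SCHEMA`, `make_row`, `generic_scan`, and `specialize_scan` are hypothetical and do not come from the paper.

```python
import operator

# Hypothetical fixed-width schema: (column name, byte offset, byte width).
SCHEMA = [("id", 0, 4), ("qty", 4, 4), ("price", 8, 4)]

# Comparison operators a predicate may use.
OPS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}


def make_row(row_id, qty, price):
    """Pack three integers into a 12-byte tuple matching SCHEMA."""
    return b"".join(v.to_bytes(4, "little") for v in (row_id, qty, price))


def generic_scan(rows, schema, column, op_name, constant):
    """Generic loop: consults the schema and the predicate for every tuple."""
    out = []
    for raw in rows:
        # Work that never changes during the query, repeated per tuple:
        # find the column in the schema and look up the operator.
        for name, offset, width in schema:
            if name == column:
                value = int.from_bytes(raw[offset:offset + width], "little")
                break
        else:
            raise KeyError(column)
        if OPS[op_name](value, constant):
            out.append(raw)
    return out


def specialize_scan(schema, column, op_name, constant):
    """Resolve the invariants once and return a loop bound to them."""
    offset, width = next((o, w) for n, o, w in schema if n == column)
    end = offset + width
    compare = OPS[op_name]

    def scan(rows):
        # The per-tuple work is only: slice, decode, compare.
        return [
            raw for raw in rows
            if compare(int.from_bytes(raw[offset:end], "little"), constant)
        ]

    return scan


if __name__ == "__main__":
    rows = [make_row(1, 5, 100), make_row(2, 50, 200), make_row(3, 7, 300)]
    scan = specialize_scan(SCHEMA, "qty", "<", 10)
    print(len(scan(rows)), len(generic_scan(rows, SCHEMA, "qty", "<", 10)))
```

Both functions return the same tuples. The specialized version simply moves the schema lookup and operator dispatch out of the loop, which is the kind of per-tuple overhead the abstract identifies as a specialization target.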



Published In

CGO '12: Proceedings of the Tenth International Symposium on Code Generation and Optimization
March 2012
285 pages
ISBN: 9781450312066
DOI: 10.1145/2259016
Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CGO '12

Acceptance Rates

CGO '12 paper acceptance rate: 26 of 90 submissions (29%)
Overall acceptance rate: 312 of 1,061 submissions (29%)

Cited By

  • (2024) Morpheus: A Run Time Compiler and Optimizer for Software Data Planes. IEEE/ACM Transactions on Networking 32:3 (2269-2284). DOI: 10.1109/TNET.2023.3346286. Online publication date: 1-Jun-2024
  • (2022) Automatic Array Transformation to Columnar Storage at Run Time. Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes (16-28). DOI: 10.1145/3546918.3546919. Online publication date: 14-Sep-2022
  • (2022) Domain specific run time optimization for software data planes. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (1148-1164). DOI: 10.1145/3503222.3507769. Online publication date: 28-Feb-2022
  • (2021) Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra. The VLDB Journal. DOI: 10.1007/s00778-020-00643-4. Online publication date: 2-Jun-2021
  • (2018) Building Efficient Query Engines in a High-Level Language. ACM Transactions on Database Systems 43:1 (1-45). DOI: 10.1145/3183653. Online publication date: 11-Apr-2018
  • (2018) Designing an Adaptive VM That Combines Vectorized and JIT Execution on Heterogeneous Hardware. 2018 IEEE 34th International Conference on Data Engineering (ICDE) (1684-1688). DOI: 10.1109/ICDE.2018.00215. Online publication date: Apr-2018
  • (2018) Adaptive Execution of Compiled Queries. 2018 IEEE 34th International Conference on Data Engineering (ICDE) (197-208). DOI: 10.1109/ICDE.2018.00027. Online publication date: Apr-2018
  • (2018) Runtime Specialization of PostgreSQL Query Executor. Perspectives of System Informatics (375-386). DOI: 10.1007/978-3-319-74313-4_27. Online publication date: 18-Jan-2018
  • (2017) Generalized Micro-Specialization for Modern DBMS Architectures. Proceedings of the 2017 ACM International Conference on Management of Data (49-51). DOI: 10.1145/3055167.3055182. Online publication date: 14-May-2017
  • (2016) Hash Map Inlining. Proceedings of the 2016 International Conference on Parallel Architectures and Compilation (235-246). DOI: 10.1145/2967938.2967949. Online publication date: 11-Sep-2016
