Article

Buffering database operations for enhanced instruction cache performance

Authors:

Kenneth A. Ross

SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data

Pages 191 - 202

https://doi.org/10.1145/1007568.1007592

Published: 13 June 2004

Abstract

As more and more query processing work can be done in main memory, memory access is becoming a significant cost component of database operations. Recent database research has shown that most memory stalls are due to second-level cache data misses and first-level instruction cache misses. While a lot of research has focused on reducing data cache misses, relatively little has been done to improve the instruction cache performance of database systems.

We first answer the question "Why does a database system incur so many instruction cache misses?" We demonstrate that current demand-pull pipelined query execution engines suffer from significant instruction cache thrashing between different operators. We propose techniques to buffer database operations during query execution to avoid instruction cache thrashing. We implement a new light-weight "buffer" operator and study various factors that may affect cache performance. We also introduce a plan refinement algorithm that examines the query plan and decides whether adding "buffer" operators is beneficial and where to place them. The benefit comes mainly from better instruction locality and better hardware branch prediction. Our techniques can be integrated into current database systems without significant changes. Our experiments in a memory-resident PostgreSQL database system show that buffering techniques can reduce the number of instruction cache misses by up to 80% and improve query performance by up to 15%.
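The buffering idea in the abstract can be illustrated with a small sketch. This is not the paper's PostgreSQL implementation: the `Scan`, `Filter`, and `Buffer` classes, the `run` helper, and the `trace` list are hypothetical names introduced here, and the count of "switches" between operators is only a rough stand-in for instruction cache thrashing. The sketch assumes a demand-pull model in which each operator exposes a `next()` call that returns one row or `None`.

```python
from typing import Callable, List, Optional


class Scan:
    """Leaf operator: returns one row per next() call, then None."""

    def __init__(self, rows: List[int], trace: List[str]) -> None:
        self.rows, self.pos, self.trace = rows, 0, trace

    def next(self) -> Optional[int]:
        if self.pos >= len(self.rows):
            return None
        self.trace.append("scan")  # record that scan code ran for this row
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    """Parent operator: pulls rows from its child and keeps matching ones."""

    def __init__(self, child, pred: Callable[[int], bool], trace: List[str]) -> None:
        self.child, self.pred, self.trace = child, pred, trace

    def next(self) -> Optional[int]:
        while True:
            row = self.child.next()
            if row is None:
                return None
            self.trace.append("filter")  # record that filter code ran
            if self.pred(row):
                return row


class Buffer:
    """Hypothetical buffer operator: when empty, it pulls up to `size` rows
    from its child in one burst, then hands them out one at a time."""

    def __init__(self, child, size: int) -> None:
        self.child, self.size = child, size
        self.buf: List[int] = []
        self.pos = 0
        self.done = False

    def next(self) -> Optional[int]:
        if self.pos >= len(self.buf):
            if self.done:
                return None
            self.buf, self.pos = [], 0
            while len(self.buf) < self.size:  # child code runs back to back
                row = self.child.next()
                if row is None:
                    self.done = True
                    break
                self.buf.append(row)
            if not self.buf:
                return None
        row = self.buf[self.pos]
        self.pos += 1
        return row


def run(buffer_size: int = 0):
    """Run Filter over Scan, optionally with a Buffer in between.
    Returns the output rows and the number of operator switches."""
    trace: List[str] = []
    child = Scan(list(range(8)), trace)
    if buffer_size > 0:
        child = Buffer(child, buffer_size)
    top = Filter(child, lambda r: r % 2 == 0, trace)
    out = []
    while (row := top.next()) is not None:
        out.append(row)
    switches = sum(a != b for a, b in zip(trace, trace[1:]))
    return out, switches
```

With eight input rows, `run(0)` and `run(4)` return the same rows, but the unbuffered plan alternates between scan and filter code on every row (15 switches), while the buffered plan runs each operator in bursts of four (3 switches). In a real engine the point of such bursts would be that one operator's instructions stay resident in the instruction cache while it processes a batch.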

Information & Contributors

Information

Published In

SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data

June 2004

988 pages

ISBN:1581138598

DOI:10.1145/1007568

Conference Chairs: Arnd Christian König (Microsoft Research), Stefan Dessloch (University of Kaiserslautern, Germany)

General Chair: Patrick Valduriez (INRIA, France)

Program Chair: Gerhard Weikum (University of the Saarland)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGMOD/PODS04

Sponsor:

SIGMOD

SIGMOD/PODS04: International Conference on Management of Data and Symposium on Principles of Database Systems

June 13 - 18, 2004

Paris, France

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics

Total Citations: 59

Total Downloads: 999

Downloads (Last 12 months): 19

Downloads (Last 6 weeks): 3

Reflects downloads up to 20 Feb 2025

Citations

Cited By

Eymer J, Dexter P, Raskind J, Liu Y (2024). A Runtime System for Interruptible Query Processing: When Incremental Computing Meets Fine-Grained Parallelism. Proceedings of the ACM on Programming Languages 8:OOPSLA2, pages 1729-1756. Online publication date: 8-Oct-2024. https://dl.acm.org/doi/10.1145/3689772

Li Y, Hu H, Lei C, Zhou X, Qian W (2024). Hill-Cache: Adaptive Integration of Recency and Frequency in Caching with Hill-Climbing. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 3947-3960. Online publication date: 13-May-2024. https://doi.org/10.1109/ICDE60146.2024.00302

Lin W, Qin J, Chen Y, Jin Z, Xu J, Zhang Y, Cai S, Fu L, Chen Y, Chen W (2023). JACO: JAva Code Layout Optimizer Enabling Continuous Optimization without Pausing Application Services. 2023 IEEE International Conference on Cluster Computing (CLUSTER), pages 295-306. Online publication date: 31-Oct-2023. https://doi.org/10.1109/CLUSTER52292.2023.00032

Khan T, Ugur M, Nathella K, Sunwoo D, Litz H, Jimenez D, Kasikci B (2022). Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 19-34. Online publication date: Oct-2022. https://doi.org/10.1109/MICRO56248.2022.00017

Khan T, Brown N, Sriraman A, Soundararajan N, Kumar R, Devietti J, Subramoney S, Pokam G, Litz H, Kasikci B (2021). Twig: Profile-Guided BTB Prefetching for Data Center Applications. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pages 816-829. Online publication date: 18-Oct-2021. https://dl.acm.org/doi/10.1145/3466752.3480124

Kang D, Jiang R, Blanas S, Li G, Li Z, Idreos S, Srivastava D (2021). Jigsaw: A Data Storage and Query Processing Engine for Irregular Table Partitioning. Proceedings of the 2021 International Conference on Management of Data, pages 898-911. Online publication date: 9-Jun-2021. https://dl.acm.org/doi/10.1145/3448016.3457547

Chen H, D'silva J, Chen H, Kemme B, Hendren L (2020). HorseIR: bringing array programming languages together with database query processing. ACM SIGPLAN Notices 53:8, pages 37-49. Online publication date: 6-Apr-2020. https://dl.acm.org/doi/10.1145/3393673.3276951

Khan T, Sriraman A, Devietti J, Pokam G, Litz H, Kasikci B (2020). I-SPY: Context-Driven Conditional Instruction Prefetching with Coalescing. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pages 146-159. Online publication date: Oct-2020. https://doi.org/10.1109/MICRO50266.2020.00024

Zhang S, He J, Zhou A, He B, Boncz P, Manegold S, Ailamaki A, Deshpande A, Kraska T (2019). BriskStream. Proceedings of the 2019 International Conference on Management of Data, pages 705-722. Online publication date: 25-Jun-2019. https://dl.acm.org/doi/10.1145/3299869.3300067

Chen H, D'silva J, Chen H, Kemme B, Hendren L, Felgentreff T (2018). HorseIR: bringing array programming languages together with database query processing. Proceedings of the 14th ACM SIGPLAN International Symposium on Dynamic Languages, pages 37-49. Online publication date: 24-Oct-2018. https://dl.acm.org/doi/10.1145/3276945.3276951
