A case for shared instruction cache on chip multiprocessors running OLTP

Published: 27 September 2003

Abstract

Because of their large code footprint, OLTP workloads suffer from significant I-cache miss rates on contemporary microprocessors. This paper analyzes the I-stream behavior of an OLTP workload, the Oracle Database Benchmark (ODB), on Chip-Multiprocessors (CMPs). Our results show that, although the overall code footprint of ODB is large, multiple ODB threads running concurrently on multiple processors tend to access common code segments frequently, thus exhibiting significant constructive sharing. In fact, in a CMP system, an I-cache shared among multiple processors incurs a miss rate similar to that of a dedicated I-cache per processor, where each per-processor I-cache has the same capacity as the shared I-cache. Based on these observations, this paper makes the case for a shared I-cache organization in a CMP, instead of the traditional approach of using a dedicated I-cache per processor. Furthermore, this paper shows that the OLTP code stream exhibits good spatial locality. Adding a simple dedicated Line Buffer per processor can exploit this spatial locality effectively, reducing the latency and bandwidth requirements on the shared cache. The proposed shared I-cache organization improves the miss rate by at least 5X over a dedicated cache organization of the same total capacity.
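
The intuition behind constructive sharing and the per-processor Line Buffer can be illustrated with a toy model. The sketch below is not the paper's simulator or methodology: the fully associative LRU cache, the 64-byte line, the 4-byte instruction width, the lockstep round-robin fetch order, and every name in it (LRUCache, simulate, LINE_BYTES, INSN_BYTES) are assumptions made purely for illustration. Every processor runs the same straight-line code loop, and the model compares one shared cache against dedicated caches of the same total capacity, each processor having a one-entry line buffer in front of the cache.

```python
from collections import OrderedDict

LINE_BYTES = 64   # assumed cache-line size (not taken from the paper)
INSN_BYTES = 4    # assumed fixed instruction width (not taken from the paper)


class LRUCache:
    """Fully associative LRU cache of line addresses (deliberately simple)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()
        self.accesses = 0
        self.misses = 0

    def access(self, line):
        self.accesses += 1
        if line in self.lines:
            self.lines.move_to_end(line)        # mark as most recently used
            return True
        self.misses += 1
        self.lines[line] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)      # evict least recently used
        return False


def simulate(num_cpus, total_lines, footprint_lines, passes, shared):
    """Fetch the same code loop on every CPU in lockstep round-robin order.

    Returns (cache_accesses, cache_misses, instruction_fetches).
    """
    if shared:
        # One cache object referenced by every CPU.
        caches = [LRUCache(total_lines)] * num_cpus
    else:
        # Same total capacity, split evenly into dedicated caches.
        caches = [LRUCache(total_lines // num_cpus) for _ in range(num_cpus)]
    line_buffer = [None] * num_cpus             # last line fetched by each CPU
    fetches = 0
    for _ in range(passes):
        for pc in range(0, footprint_lines * LINE_BYTES, INSN_BYTES):
            line = pc // LINE_BYTES
            for cpu in range(num_cpus):
                fetches += 1
                if line_buffer[cpu] == line:
                    continue                    # served by the line buffer
                caches[cpu].access(line)
                line_buffer[cpu] = line
    distinct = {id(c): c for c in caches}.values()  # count a shared cache once
    accesses = sum(c.accesses for c in distinct)
    misses = sum(c.misses for c in distinct)
    return accesses, misses, fetches


if __name__ == "__main__":
    for shared in (True, False):
        acc, miss, fetch = simulate(4, 64, 64, 3, shared)
        label = "shared" if shared else "dedicated"
        print(f"{label}: {fetch} fetches, {acc} cache accesses, {miss} misses")
```

In this artificial setup the loop fits in the shared cache but not in any dedicated cache, so the dedicated caches miss on every access while the shared cache misses only on the first touch of each line, and the line buffers absorb all but one fetch per line per processor. The size of the gap is an artifact of the synthetic trace and should not be read as a reproduction of the paper's measured results.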

Published In

ACM SIGARCH Computer Architecture News  Volume 32, Issue 3
Special issue: MEDEA-2003 workshop
June 2004
81 pages
ISSN:0163-5964
DOI:10.1145/1024295
  • MEDEA '03: Proceedings of the 2003 workshop on MEmory performance: DEaling with Applications, systems and architecture
    September 2003
    75 pages
    ISBN:9781450378208
    DOI:10.1145/1152923

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 September 2003
Published in SIGARCH Volume 32, Issue 3

Qualifiers

  • Article

Cited By

  • (2008) Performance Implications of Next-Generation Multi-processing Platforms on e-Business Server Applications. Proceedings of the 2008 IEEE International Conference on e-Business Engineering, pages 37-44. DOI: 10.1109/ICEBE.2008.13. Online publication date: 22-Oct-2008
  • (2006) Improving the performance and power efficiency of shared helpers in CMPs. Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, pages 345-356. DOI: 10.1145/1176760.1176802. Online publication date: 22-Oct-2006
  • (2006) Large scale Itanium® 2 processor OLTP workload characterization and optimization. Proceedings of the 2nd international workshop on Data management on new hardware, pages 3-es. DOI: 10.1145/1140402.1140406. Online publication date: 25-Jun-2006
  • (2005) Dynamically configurable shared CMP helper engines for improved performance. ACM SIGARCH Computer Architecture News 33:4, pages 70-79. DOI: 10.1145/1105734.1105744. Online publication date: 1-Nov-2005
  • (2004) Managing Wire Delay in Large Chip-Multiprocessor Caches. Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, pages 319-330. DOI: 10.1109/MICRO.2004.21. Online publication date: 4-Dec-2004
  • (2019) Reducing Data Movement and Energy in Multilevel Cache Hierarchies without Losing Performance: Can you have it all? Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, pages 382-393. DOI: 10.1109/PACT.2019.00037. Online publication date: 23-Sep-2019
  • (2017) Sharing the instruction cache among lean cores on an asymmetric CMP for HPC applications. 2017 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 3-12. DOI: 10.1109/ISPASS.2017.7975265. Online publication date: Apr-2017
