Segmented Bitline Cache: Exploiting Non-uniform Memory Access Patterns

  • Conference paper
High Performance Computing - HiPC 2006 (HiPC 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4297)

Abstract

On-chip caches in modern processors account for a sizable fraction of dynamic and leakage power. Much of this power is wasted: it is required only because the memory cells farthest from the sense amplifiers must discharge a large capacitance on the bitlines. We reduce the overall bitline capacitance by segmenting the memory cells along the bitlines and turning off the segmenters.
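The capacitance argument above can be illustrated with a first-order model. This is a minimal sketch, not the paper's circuit model: it assumes equal-sized segments, a uniform per-cell capacitance, and an optional per-switch capacitance, and the names `bitline_capacitance`, `c_cell`, and `c_switch` are hypothetical.

```python
def bitline_capacitance(segment, cells_per_segment, c_cell=1.0, c_switch=0.0):
    """Relative bitline capacitance seen when reading a cell in `segment`.

    Assumptions (not taken from the paper):
    - segment 0 is the one nearest the sense amplifiers;
    - every segment farther away than the accessed one is isolated;
    - each cell adds `c_cell` and each conducting segmenter adds `c_switch`.
    """
    if segment < 0 or cells_per_segment <= 0:
        raise ValueError("segment must be >= 0 and cells_per_segment > 0")
    # Cells in segments 0..segment remain connected to the sense amplifier.
    connected_cells = (segment + 1) * cells_per_segment
    # One segmenter sits between each pair of adjacent connected segments.
    switches_on_path = segment
    return connected_cells * c_cell + switches_on_path * c_switch
```

Under these assumptions, with four equal segments and negligible switch capacitance, a read from the nearest segment sees one quarter of the capacitance of a read from the farthest segment, which in turn matches the unsegmented bitline.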

The success of this cache relies on accessing segments near the sense amps much more often than remote segments. We show that the access pattern to the first-level data and instruction caches is extremely skewed: only a small set of cache lines is accessed frequently. We exploit this non-uniform cache access pattern by mapping the frequently accessed cache lines closer to the sense amps. These lines are isolated by segmenting circuits on the bitlines and hence dissipate less power when accessed.
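The idea of placing frequently accessed lines near the sense amps can be sketched as a simple ranking. This is an illustrative static assignment under stated assumptions, not the mapping policy evaluated in the paper: `assign_segments` is a hypothetical helper, the trace is a list of cache line indices, and segments are assumed to hold equal numbers of lines.

```python
from collections import Counter


def assign_segments(access_trace, num_lines, num_segments):
    """Map each cache line index to a segment, hottest lines first.

    Segment 0 is assumed to be nearest the sense amps. Ties in access
    count are broken by line index so the result is deterministic.
    """
    if num_segments <= 0 or num_lines % num_segments != 0:
        raise ValueError("num_lines must be a positive multiple of num_segments")
    counts = Counter(access_trace)
    # Sort lines by descending access count, then by ascending index.
    ranked = sorted(range(num_lines), key=lambda line: (-counts[line], line))
    lines_per_segment = num_lines // num_segments
    return {line: rank // lines_per_segment for rank, line in enumerate(ranked)}


# Example: lines 5 and 2 dominate the trace, so they land in segment 0.
mapping = assign_segments([5, 5, 5, 2, 2, 7], num_lines=8, num_segments=4)
```

The more skewed the trace, the larger the share of accesses that fall in the nearest segment under this assignment.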

Modifications to the address decoder enable dynamic re-mapping of cache lines to segments. In this paper, we explore the design space of segmenting the level-one data and instruction caches. Instruction and data caches show potential power savings of 10% and 6%, respectively, on the subset of benchmarks simulated.
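The abstract does not describe how the decoder performs the re-mapping. One way to picture it is as a level of indirection between a logical row index and a physical row; the sketch below is purely an assumption for illustration, and `RemapTable`, `promote`, and `segment_of` are hypothetical names rather than anything defined in the paper.

```python
class RemapTable:
    """Hypothetical logical-to-physical row indirection.

    Physical row 0 is assumed to be nearest the sense amps. A hardware
    design would also have to move or refill the data of swapped rows;
    that step is omitted here.
    """

    def __init__(self, num_rows, rows_per_segment):
        if rows_per_segment <= 0 or num_rows % rows_per_segment != 0:
            raise ValueError("num_rows must be a positive multiple of rows_per_segment")
        self.rows_per_segment = rows_per_segment
        self.logical_to_physical = list(range(num_rows))
        self.physical_to_logical = list(range(num_rows))

    def segment_of(self, logical_row):
        """Segment currently holding the given logical row."""
        return self.logical_to_physical[logical_row] // self.rows_per_segment

    def promote(self, logical_row, target_physical_row):
        """Swap a logical row with whichever row occupies the target slot."""
        victim = self.physical_to_logical[target_physical_row]
        source_physical_row = self.logical_to_physical[logical_row]
        self.logical_to_physical[logical_row] = target_physical_row
        self.logical_to_physical[victim] = source_physical_row
        self.physical_to_logical[target_physical_row] = logical_row
        self.physical_to_logical[source_physical_row] = victim


# Example: move logical row 7 from the farthest segment to the nearest one.
table = RemapTable(num_rows=8, rows_per_segment=2)
table.promote(7, target_physical_row=0)
```

A swap keeps the mapping one-to-one, so every logical row still resolves to exactly one physical row after any sequence of promotions.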

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rao, R., Wenck, J., Franklin, D., Amirtharajah, R., Akella, V. (2006). Segmented Bitline Cache: Exploiting Non-uniform Memory Access Patterns. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_17

  • DOI: https://doi.org/10.1007/11945918_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68039-0

  • Online ISBN: 978-3-540-68040-6

  • eBook Packages: Computer Science, Computer Science (R0)