Segmented Bitline Cache: Exploiting Non-uniform Memory Access Patterns

Rao, Ravishankar; Wenck, Justin; Franklin, Diana; Amirtharajah, Rajeevan; Akella, Venkatesh

doi:10.1007/11945918_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4297)

Included in the following conference series:

International Conference on High-Performance Computing

Abstract

On-chip caches in modern processors account for a sizable fraction of dynamic and leakage power. Much of this power is wasted: it is spent only because the memory cells farthest from the sense amplifiers must discharge a large capacitance on the bitlines. We reduce this capacitance by segmenting the memory cells along the bitlines and turning off the segmenters, which lowers the overall bitline capacitance.
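To make the capacitance argument concrete, the following is a minimal sketch of a first-order model; it is not taken from the paper. It assumes that a read of segment k (with segment 0 next to the sense amplifiers) keeps segments 0 through k connected, isolates all farther segments, and pays a fixed extra capacitance per closed segmenter. The function names, the parameters `c_cell` and `c_switch`, and the assumption that read energy scales linearly with bitline capacitance are all illustrative.

```python
def bitline_capacitance(segment, cells_per_segment, c_cell, c_switch):
    """Capacitance seen when reading a cell in `segment` (0 = nearest the sense amps).

    Assumed model: segments 0..segment stay connected, farther segments are
    isolated, and each closed segmenter on the path adds c_switch.
    """
    if segment < 0:
        raise ValueError("segment must be non-negative")
    connected_cells = (segment + 1) * cells_per_segment
    return connected_cells * c_cell + segment * c_switch


def relative_read_energy(segment, num_segments, cells_per_segment, c_cell, c_switch):
    """Read energy relative to an unsegmented bitline, assuming energy ~ capacitance."""
    if not 0 <= segment < num_segments:
        raise ValueError("segment out of range")
    unsegmented = num_segments * cells_per_segment * c_cell
    segmented = bitline_capacitance(segment, cells_per_segment, c_cell, c_switch)
    return segmented / unsegmented


# Arbitrary units: 4 segments of 64 cells, each segmenter costing 4 cell loads.
nearest = relative_read_energy(0, 4, 64, 1.0, 4.0)   # 0.25
farthest = relative_read_energy(3, 4, 64, 1.0, 4.0)  # 1.046875
```

Under these assumed numbers, a read of the nearest segment sees a quarter of the unsegmented capacitance, while a read of the farthest segment costs slightly more than the unsegmented case because of the segmenter overhead. This is why the scheme only pays off when near segments are accessed far more often than remote ones.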

The success of this cache relies on accessing segments near the sense amplifiers much more often than remote segments. We show that the access pattern to the first-level data and instruction caches is extremely skewed: only a small set of cache lines is accessed frequently. We exploit this non-uniform access pattern by mapping the frequently accessed cache lines closer to the sense amplifiers. These lines are isolated by segmenting circuits on the bitlines and hence dissipate less power when accessed.
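The mapping idea can be sketched as a simple ranking: sort cache lines by access count and fill segments starting from the one nearest the sense amplifiers. This is an illustrative sketch, not the paper's mechanism; the function names are hypothetical and the access counts below are synthetic, chosen only to mimic a skewed pattern.

```python
def map_hot_lines_to_near_segments(access_counts, lines_per_segment):
    """Return {line_id: segment}, placing the most-accessed lines in segment 0.

    Ties are broken by line id so the result is deterministic.
    """
    ranked = sorted(access_counts, key=lambda line: (-access_counts[line], line))
    return {line: rank // lines_per_segment for rank, line in enumerate(ranked)}


def fraction_of_accesses_in_segment(access_counts, mapping, segment=0):
    """Share of all accesses that land in the given segment under `mapping`."""
    total = sum(access_counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for line, n in access_counts.items() if mapping[line] == segment)
    return hits / total


# Synthetic, made-up counts for 8 cache lines: two lines dominate.
counts = {0: 5, 1: 900, 2: 10, 3: 700, 4: 3, 5: 2, 6: 1, 7: 4}
mapping = map_hot_lines_to_near_segments(counts, lines_per_segment=2)
near_share = fraction_of_accesses_in_segment(counts, mapping, segment=0)
```

With these synthetic counts, lines 1 and 3 land in segment 0 and together absorb 1600 of the 1625 accesses, so almost every access sees only the short, isolated portion of the bitline.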

Modifications to the address decoder enable dynamic remapping of cache lines to segments. In this paper, we explore the design space of segmenting the level-one data and instruction caches. Instruction and data caches show potential power savings of 10% and 6%, respectively, on the subset of benchmarks simulated.
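One way to picture dynamic remapping is as an indirection table inside the decoder that translates a logical row into a physical row, plus a policy that swaps a hot row into the segment nearest the sense amplifiers. The sketch below is a software analogy under assumed details: the class name, the per-row hit counters, and the swap-with-coldest-near-row policy are hypothetical and are not described in the abstract. A hardware design would also have to move the stored data when two rows swap, which this model omits.

```python
class RemappingDecoder:
    """Toy model of a decoder with a logical-to-physical row indirection table."""

    def __init__(self, num_rows, rows_per_segment):
        self.rows_per_segment = rows_per_segment
        self.logical_to_physical = list(range(num_rows))
        self.physical_to_logical = list(range(num_rows))
        self.hits = [0] * num_rows  # access count per logical row

    def segment_of(self, logical_row):
        """Segment currently holding this row (0 = nearest the sense amps)."""
        return self.logical_to_physical[logical_row] // self.rows_per_segment

    def access(self, logical_row):
        """Record an access and return the physical row that is activated."""
        self.hits[logical_row] += 1
        return self.logical_to_physical[logical_row]

    def promote(self, logical_row):
        """Swap the row with the coldest row in segment 0 if it is strictly hotter."""
        if self.segment_of(logical_row) == 0:
            return False
        near_rows = self.physical_to_logical[:self.rows_per_segment]
        victim = min(near_rows, key=lambda row: self.hits[row])
        if self.hits[victim] >= self.hits[logical_row]:
            return False
        p_hot = self.logical_to_physical[logical_row]
        p_victim = self.logical_to_physical[victim]
        self.logical_to_physical[logical_row] = p_victim
        self.logical_to_physical[victim] = p_hot
        self.physical_to_logical[p_victim] = logical_row
        self.physical_to_logical[p_hot] = victim
        return True


decoder = RemappingDecoder(num_rows=8, rows_per_segment=2)
for _ in range(3):
    decoder.access(6)          # row 6 starts in the farthest segment
moved = decoder.promote(6)     # swaps row 6 with a cold row in segment 0
```

Because rows are only ever swapped, the table stays a permutation, so every logical row still maps to exactly one physical row after any sequence of promotions.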

Author information

Authors and Affiliations

University of California, Davis
Ravishankar Rao, Justin Wenck, Rajeevan Amirtharajah & Venkatesh Akella
California Polytechnic State University, San Luis Obispo
Diana Franklin

Editor information

Editors and Affiliations

Yves Robert
Manish Parashar, Department of Electrical and Computer Engineering, Rutgers, the State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854, USA
Ramamurthy Badrinath, Hewlett-Packard ISO, Sy 192, Whitefield Road, Mahadevapura Post, Bangalore 560048, India
Viktor K. Prasanna, Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2562, USA

About this paper

Cite this paper

Rao, R., Wenck, J., Franklin, D., Amirtharajah, R., Akella, V. (2006). Segmented Bitline Cache: Exploiting Non-uniform Memory Access Patterns. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_17

DOI: https://doi.org/10.1007/11945918_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68039-0
Online ISBN: 978-3-540-68040-6
eBook Packages: Computer Science, Computer Science (R0)
