
Dynamic Partitioning of Shared Cache Memory

Published in The Journal of Supercomputing

Abstract

This paper proposes dynamic partitioning of a shared cache among simultaneously executing processes and threads, using a general partitioning scheme that can be applied to set-associative caches.

Since the memory reference characteristics of processes and threads can change over time, our method collects their cache-miss characteristics at run time. The workload itself is also determined at run time by the operating system scheduler. Our scheme combines this information and partitions the cache among the executing processes and threads, varying partition sizes dynamically to reduce the total number of misses.
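The abstract does not give the allocation procedure itself; the following is a minimal illustrative sketch (not the authors' implementation) of one standard way such partitioning can be done: greedily assign cache ways to whichever process shows the largest marginal reduction in misses, assuming each process's miss curve (misses as a function of allocated ways) has been measured at run time. The function name, curve data, and way counts below are all hypothetical.

```python
# Hypothetical sketch of greedy marginal-gain cache-way partitioning.
def partition_ways(miss_curves, total_ways):
    """miss_curves[p][k] = misses of process p when given k ways
    (each curve needs total_ways + 1 entries, k = 0..total_ways).
    Returns alloc, where alloc[p] is the number of ways assigned to p."""
    n = len(miss_curves)
    alloc = [0] * n
    for _ in range(total_ways):
        # Give the next way to the process whose misses drop the most.
        best = max(range(n),
                   key=lambda p: miss_curves[p][alloc[p]]
                                 - miss_curves[p][alloc[p] + 1])
        alloc[best] += 1
    return alloc

# Two made-up miss curves for an 8-way cache: process 0 saturates after a
# few ways, while process 1 keeps benefiting from additional ways.
curves = [
    [100, 40, 20, 15, 14, 14, 14, 14, 14],
    [100, 90, 80, 70, 60, 50, 45, 44, 44],
]
print(partition_ways(curves, 8))  # → [3, 5]
```

With convex miss curves, this greedy allocation is the classical marginal-analysis approach; with non-convex curves a more careful search would be needed.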

The partitioning scheme has been evaluated using a processor simulator modeling a two-processor CMP system. The results show that the scheme can significantly improve total IPC over the standard least recently used (LRU) replacement policy; in one case, partitioning doubles the total IPC over standard LRU. Our results show that smart cache management and scheduling are essential to achieving high performance with shared cache memory.




Cite this article

Suh, G.E., Rudolph, L. & Devadas, S. Dynamic Partitioning of Shared Cache Memory. The Journal of Supercomputing 28, 7–26 (2004). https://doi.org/10.1023/B:SUPE.0000014800.27383.8f
