
Dynamic Partitioning of Shared Cache Memory

Published in The Journal of Supercomputing

Abstract

This paper proposes dynamic partitioning of a shared cache among simultaneously executing processes and threads, using a general partitioning scheme that can be applied to set-associative caches.

Since the memory reference characteristics of processes and threads can change over time, our method collects their cache-miss characteristics at run time. The workload itself is also determined at run time by the operating system scheduler. Our scheme combines this information and partitions the cache among the executing processes and threads, varying partition sizes dynamically to reduce the total number of misses.
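The abstract does not give the allocation procedure itself; the following is a minimal illustrative sketch (not the authors' implementation) of one standard way such partitioning can be done: greedily assign cache ways to whichever process shows the largest marginal reduction in misses, assuming each process's miss curve (misses as a function of allocated ways) has been measured at run time. The function name, curve data, and way counts below are all hypothetical.

```python
# Hypothetical sketch of greedy marginal-gain cache-way partitioning.
def partition_ways(miss_curves, total_ways):
    """miss_curves[p][k] = misses of process p when given k ways
    (each curve needs total_ways + 1 entries, k = 0..total_ways).
    Returns alloc, where alloc[p] is the number of ways assigned to p."""
    n = len(miss_curves)
    alloc = [0] * n
    for _ in range(total_ways):
        # Give the next way to the process whose misses drop the most.
        best = max(range(n),
                   key=lambda p: miss_curves[p][alloc[p]]
                                 - miss_curves[p][alloc[p] + 1])
        alloc[best] += 1
    return alloc

# Two made-up miss curves for an 8-way cache: process 0 saturates after a
# few ways, while process 1 keeps benefiting from additional ways.
curves = [
    [100, 40, 20, 15, 14, 14, 14, 14, 14],
    [100, 90, 80, 70, 60, 50, 45, 44, 44],
]
print(partition_ways(curves, 8))  # → [3, 5]
```

With convex miss curves, this greedy allocation is the classical marginal-analysis approach; with non-convex curves a more careful search would be needed.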

The partitioning scheme has been evaluated using a processor simulator modeling a two-processor CMP system. The results show that the scheme can significantly improve total IPC over the standard least recently used (LRU) replacement policy; in one case, partitioning doubles the total IPC over standard LRU. Our results show that smart cache management and scheduling are essential to achieving high performance with shared cache memory.




Cite this article

Suh, G.E., Rudolph, L. & Devadas, S. Dynamic Partitioning of Shared Cache Memory. The Journal of Supercomputing 28, 7–26 (2004). https://doi.org/10.1023/B:SUPE.0000014800.27383.8f
