Abstract
Chip multiprocessors (CMPs) usually employ shared last-level caches to use on-chip memory resources effectively. Unfortunately, conventional replacement policies applied to shared caches fail to partition memory resources among cores in a way that achieves optimal execution throughput. This paper presents a novel replacement policy that dynamically estimates how many misses would be eliminated if one more block per set were allocated to a given processor, taking into account the extra misses this would cause for some other processor. Our implementation makes novel use of shadow tags for the estimation. We show that it can yield 50% higher execution throughput on a 4-way CMP and, in contrast to previously proposed schemes, we did not observe any noticeable degradation of performance for any of the SPEC2000 applications we used.
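The gain-versus-loss estimate described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the counter names (`shadow_hits`, `lru_hits`), the per-core `quota` field, the `min_quota` floor, and the rule of moving exactly one block per set per decision are assumptions made for illustration. The sketch assumes that a hit in a core's shadow tag approximates a miss that one extra block per set would have removed, and that a hit on a core's least-recently-used block approximates a hit that would be lost with one block fewer.

```python
from dataclasses import dataclass


@dataclass
class CoreStats:
    # All three fields are hypothetical names chosen for this sketch.
    shadow_hits: int  # misses that matched a shadow tag (estimated gain of +1 block per set)
    lru_hits: int     # hits on the core's LRU block (estimated loss of -1 block per set)
    quota: int        # blocks per set currently allocated to the core


def repartition(stats, min_quota=1):
    """Move one block per set from the cheapest donor to the core that
    would gain most, but only if the estimated net miss count drops.

    Returns (gainer, donor) when a move is made, otherwise None.
    """
    cores = range(len(stats))
    # Core with the largest estimated benefit from one more block per set.
    gainer = max(cores, key=lambda c: stats[c].shadow_hits)
    # Any other core that can still give up a block is a candidate donor.
    donors = [c for c in cores if c != gainer and stats[c].quota > min_quota]
    if not donors:
        return None
    # Donor that would suffer the fewest extra misses.
    donor = min(donors, key=lambda c: stats[c].lru_hits)
    # Repartition only when misses removed exceed misses added.
    if stats[gainer].shadow_hits > stats[donor].lru_hits:
        stats[gainer].quota += 1
        stats[donor].quota -= 1
        return gainer, donor
    return None
```

In a real cache such counters would be sampled over an interval and reset after each decision; the sketch omits that bookkeeping and shows only the comparison itself.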
This work is partly sponsored by the HiPEAC Network of Excellence funded by EU under FP6.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dybdahl, H., Stenström, P., Natvig, L. (2006). A Cache-Partitioning Aware Replacement Policy for Chip Multiprocessors. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_9
DOI: https://doi.org/10.1007/11945918_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68039-0
Online ISBN: 978-3-540-68040-6
eBook Packages: Computer Science, Computer Science (R0)