Abstract
The LRU replacement policy is commonly used in the last-level caches of multiprocessors. However, LRU does not work well for memory-intensive workloads whose working sets are larger than the available cache size. When a newly arrived cache block is inserted at the MRU position, it may never be reused before it is evicted, yet it occupies cache space for a long time as it moves from the MRU to the LRU position. This results in inefficient use of cache space. If a new cache block is instead inserted directly at the LRU position, cache performance can improve because some fraction of the working set is retained in the cache.
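The effect of the insertion position can be illustrated with a minimal sketch that models a single cache set as a recency stack. The function name `simulate`, the 4-way set, and the cyclic trace over 6 blocks are assumptions chosen for illustration and are not taken from the paper.

```python
def simulate(trace, ways, insert_at_mru):
    """Count hits for one cache set managed as a recency stack.

    Index 0 of `stack` is the MRU position; the last index is the LRU position.
    """
    stack = []
    hits = 0
    for block in trace:
        if block in stack:
            hits += 1
            stack.remove(block)
            stack.insert(0, block)      # a hit always promotes the block to MRU
        else:
            if len(stack) == ways:
                stack.pop()             # evict the block at the LRU position
            if insert_at_mru:
                stack.insert(0, block)  # conventional LRU insertion
            else:
                stack.append(block)     # insertion at the LRU position


    return hits


# Hypothetical thrashing workload: 6 blocks accessed cyclically, 4-way set.
trace = list(range(6)) * 100
lru_hits = simulate(trace, ways=4, insert_at_mru=True)
lip_hits = simulate(trace, ways=4, insert_at_mru=False)
print(lru_hits, lip_hits)
```

With MRU insertion, every block is evicted shortly before it is reused, so the set never hits. With LRU insertion, three of the six blocks stay resident and hit on every pass after the first.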
In this work, we propose the Enhanced Dynamic Insertion Policy (EDIP) and the Thread-Aware Enhanced Dynamic Insertion Policy (TAEDIP), which adjust the probability of insertion at the MRU position by set dueling. Runtime information for the previous and the next BIP level is gathered and compared with that of the current level to choose an appropriate BIP level. At the same time, access frequency is used to choose a victim. In this way, our design achieves a lower miss rate than LRU for workloads with large working sets. For workloads with small working sets, its miss rate is close to that of LRU. Simulation results for a single-core configuration with a 1MB 16-way LLC show that EDIP reduces CPI over LRU and DIP by an average of 11.4% and 1.8%, respectively. On a quad-core configuration with a 4MB 16-way LLC, TAEDIP improves performance on the weighted speedup metric by 11.2% over LRU and 3.7% over TADIP on average. On the fairness metric, TAEDIP improves performance by 11.2% over LRU and 2.6% over TADIP on average.
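The level selection described above might be sketched roughly as follows. The abstract gives neither the number of BIP levels nor their probabilities, so the `LEVELS` ladder, the function names, the miss-count comparison, and the tie-breaking rule are all hypothetical. The frequency-based victim selection is not modeled here.

```python
import random

# Hypothetical ladder of MRU-insertion probabilities, one per BIP level.
# The actual levels used by EDIP are not stated in the abstract.
LEVELS = [0.0, 1 / 64, 1 / 32, 1 / 16, 1 / 8, 1 / 4, 1 / 2, 1.0]


def insert_position(level, rng=random):
    """Return 'MRU' with the probability assigned to `level`, else 'LRU'."""
    return "MRU" if rng.random() < LEVELS[level] else "LRU"


def next_level(current, misses):
    """Choose the next BIP level from sampled miss counts (assumed metric).

    `misses` maps 'prev', 'cur', and 'next' to the misses observed in the
    sampler sets that run level current-1, current, and current+1.
    A neighbour is only considered if it exists; ties keep the current level.
    """
    candidates = {"cur": current}
    if current > 0:
        candidates["prev"] = current - 1
    if current < len(LEVELS) - 1:
        candidates["next"] = current + 1
    # Sort key: fewest misses first, preferring 'cur' when counts are equal.
    best = min(candidates, key=lambda k: (misses[k], k != "cur"))
    return candidates[best]


# Example: the sampler sets for the previous level missed least, so step down.
print(next_level(3, {"prev": 10, "cur": 20, "next": 30}))
```

Comparing only the two neighbouring levels keeps the number of dedicated sampler sets small, at the cost of moving by at most one level per decision.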
This work is supported by the Natural Science Foundation of China under Grant Nos. 60833004 and 60970002, and by the National 863 High Technology Research Program of China (No. 2008AA01A201).
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Li, C., Wang, D., Xue, Y., Wang, H., Zhang, X. (2011). Enhanced Adaptive Insertion Policy for Shared Caches. In: Temam, O., Yew, PC., Zang, B. (eds) Advanced Parallel Processing Technologies. APPT 2011. Lecture Notes in Computer Science, vol 6965. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24151-2_2
DOI: https://doi.org/10.1007/978-3-642-24151-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24150-5
Online ISBN: 978-3-642-24151-2
eBook Packages: Computer Science, Computer Science (R0)