Abstract
As the number of on-chip cores and the memory demands of applications grow, judicious management of cache resources has become not merely attractive but imperative. Cache partitioning, that is, dividing cache space between applications based on their memory demands, is a promising approach for providing the capacity benefits of a shared cache together with the performance isolation of private caches. However, naively partitioning the cache may lead to performance loss, unfairness, and a lack of quality-of-service guarantees; intelligent techniques are required to realize the full potential of cache partitioning. In this article, we present a survey of techniques for partitioning shared caches in multicore processors. We categorize the techniques based on important characteristics and provide a bird's-eye view of the field of cache partitioning.
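To make the core idea concrete, the sketch below illustrates one well-known flavor of cache partitioning: greedy utility-based allocation of cache ways, loosely in the spirit of utility-based cache partitioning (UCP). Everything here is a simplified toy model, not the mechanism of any specific paper: each application is assumed to expose a miss curve (misses it would incur at each possible way allocation, e.g., as measured by shadow-tag monitors), and ways are handed out one at a time to whichever application gains the most from an additional way.

```python
# Hypothetical sketch: greedy utility-based way partitioning.
# miss_curves[a][w] = misses application a would incur with w ways
# (w = 0 .. total_ways); such curves are assumed to come from
# per-application monitoring hardware (e.g., shadow tags).

def partition_ways(miss_curves, total_ways):
    """Greedily assign 'total_ways' cache ways across applications,
    each step giving one way to the app with the largest marginal
    miss reduction. Returns the per-application way allocation."""
    n_apps = len(miss_curves)
    alloc = [0] * n_apps
    for _ in range(total_ways):
        # Marginal utility of granting app a one more way.
        gains = [miss_curves[a][alloc[a]] - miss_curves[a][alloc[a] + 1]
                 for a in range(n_apps)]
        winner = max(range(n_apps), key=lambda a: gains[a])
        alloc[winner] += 1
    return alloc

# Example with made-up miss curves: app 0 keeps benefiting from
# extra capacity, while app 1 (streaming-like) saturates at 2 ways.
curves = [
    [100, 80, 60, 40, 20, 10, 5, 2, 1],
    [100, 50, 20, 19, 18, 18, 18, 18, 18],
]
print(partition_ways(curves, 8))  # -> [6, 2]
```

The greedy step captures the survey's central observation: a blind equal split (4 ways each) would waste capacity on the saturating application, whereas demand-aware allocation gives the cache-hungry application the ways it can actually use. Real schemes must additionally handle fairness, QoS constraints, and non-convex miss curves, which is where the surveyed techniques differ.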
A Survey of Techniques for Cache Partitioning in Multicore Processors