ABSTRACT
In recent years, the growing working-set size of applications has increased the demand for large Last Level Caches (LLCs). To meet this demand, embedded DRAM (eDRAM) caches have been considered one of the best alternatives to conventional SRAM caches. eDRAM has low leakage and provides more capacity than SRAM in the same area footprint. However, eDRAM cells hold their data only for a limited retention period, so they must be refreshed periodically, and these refreshes consume significant energy. In this paper, we present an approach that minimizes the total energy spent on refreshes by considering the presence of private blocks in the LLC. Our approach restricts refreshing of those blocks that are loaded exclusively from main memory on an LLC miss. Experimental results using full-system simulation show a 55% reduction in the total number of refreshes compared to the baseline policy, and a 62% reduction in total power consumption compared to SRAM.
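The refresh-skipping idea can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not the paper's implementation: the `Block` and `ToyLLC` classes, the `private` flag, and the `mark_shared` transition are hypothetical names introduced here, and the sketch neither models data retention nor shows how the contents of an unrefreshed block are recovered, since the abstract does not specify that.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    # Hypothetical flag: True when the block was filled from main memory
    # on an LLC miss and is (by assumption) still private to one core.
    private: bool = True


class ToyLLC:
    """Toy LLC that only counts refresh operations (no data, no timing)."""

    def __init__(self) -> None:
        self.blocks: dict[int, Block] = {}
        self.total_refreshes = 0

    def fill_from_memory(self, tag: int) -> None:
        # Assumption: a block loaded on an LLC miss starts out private.
        self.blocks[tag] = Block(tag, private=True)

    def mark_shared(self, tag: int) -> None:
        # Assumption: some later event (e.g. a second sharer) ends privacy.
        self.blocks[tag].private = False

    def refresh_pass(self, skip_private: bool) -> int:
        """Run one periodic refresh pass and return the refreshes issued."""
        issued = 0
        for block in self.blocks.values():
            if skip_private and block.private:
                continue  # restricted: private block is not refreshed
            issued += 1
        self.total_refreshes += issued
        return issued


def demo() -> tuple[int, int]:
    llc = ToyLLC()
    for tag in range(4):
        llc.fill_from_memory(tag)
    llc.mark_shared(0)  # one of the four blocks stops being private
    baseline = llc.refresh_pass(skip_private=False)
    restricted = llc.refresh_pass(skip_private=True)
    return baseline, restricted
```

In this toy setup the baseline pass refreshes all four blocks while the restricted pass refreshes only the one shared block. The numbers are illustrative and unrelated to the 55% figure reported above.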
Index Terms
- Towards Optimizing Refresh Energy in embedded-DRAM Caches using Private Blocks