ABSTRACT
Heavy leakage power consumption in on-chip last level caches (LLCs) has become the primary obstacle in architecting chip multi-processors (CMPs). Because leakage power depends directly on the supply voltage, dynamic voltage scaling (DVS) of the LLC banks, driven by periodic access profiles, is a promising way to reduce this leakage. Many prior techniques reduce leakage by estimating the working set size (WSS) of the applications and putting portions of the cache banks into a low power mode. This work instead reduces leakage by putting an entire LLC bank into a low power (snoozy) mode, applying DVS to the banks with minimal usage. The resulting performance impact of the snoozy mode is mitigated by returning some snoozy banks to active mode on demand. Experimental evaluation using full system simulation of a multi-banked 2MB 8-way set associative L2 cache shows, on average, 10% more leakage savings than a prior drowsy technique.
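The bank-level policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the two thresholds, and the epoch-based decision rule are all hypothetical, chosen only to show how periodic access profiling and on-demand wake-up might fit together.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"   # full supply voltage
    SNOOZY = "snoozy"   # scaled-down supply voltage (low leakage)


@dataclass
class Bank:
    mode: Mode = Mode.ACTIVE
    accesses: int = 0   # accesses seen in the current profiling epoch


@dataclass
class SnoozyController:
    """Hypothetical per-bank controller; thresholds are assumptions."""
    num_banks: int
    snooze_threshold: int  # at or below this many accesses per epoch, snooze the bank
    wake_threshold: int    # a snoozy bank reaching this many accesses wakes immediately
    banks: list = field(default_factory=list)

    def __post_init__(self):
        self.banks = [Bank() for _ in range(self.num_banks)]

    def access(self, bank_id):
        """Record one access; wake a snoozy bank on demand if it gets busy."""
        bank = self.banks[bank_id]
        bank.accesses += 1
        if bank.mode is Mode.SNOOZY and bank.accesses >= self.wake_threshold:
            bank.mode = Mode.ACTIVE

    def end_epoch(self):
        """Periodic profiling step: snooze lightly used banks, reset counters."""
        for bank in self.banks:
            if bank.accesses <= self.snooze_threshold:
                bank.mode = Mode.SNOOZY
            else:
                bank.mode = Mode.ACTIVE
            bank.accesses = 0

    def modes(self):
        return [bank.mode for bank in self.banks]


if __name__ == "__main__":
    ctrl = SnoozyController(num_banks=4, snooze_threshold=2, wake_threshold=3)
    for bank_id in [0, 0, 0, 1, 0, 2, 2, 2, 2]:
        ctrl.access(bank_id)
    ctrl.end_epoch()
    print([m.value for m in ctrl.modes()])
```

In this sketch the mode decision is made once per epoch, while the wake-up check runs on every access, so a snoozy bank that suddenly becomes busy does not have to wait for the next epoch boundary. How the paper actually selects banks and sets its thresholds is not stated in the abstract.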
Index Terms
- Utility Aware Snoozy Caches for Energy Efficient Chip Multi-Processors