ABSTRACT
As Networks-on-Chip (NoCs) continue to consume a large fraction of the total chip power budget, dynamic voltage and frequency scaling (DVFS) has evolved into an integral part of NoC designs. Efficient DVFS relies on accurate predictions of future network state. Most previous approaches are reactive and based on network-centric metrics, such as buffer occupancy and channel utilization. However, we find that there is little correlation between those metrics and subsequent NoC traffic, which leads to suboptimal DVFS decisions. In this work, we propose to utilize highly predictable properties of cache-coherence communication to derive more specific and reliable NoC traffic predictions. A DVFS mechanism based on our traffic predictions reduces power by 41% compared to a baseline without DVFS and by 21% on average compared to a state-of-the-art DVFS implementation, while degrading performance by only 3%.
Index Terms
- Improving DVFS in NoCs with Coherence Prediction