Simulation of NoC power-gating: Requirements, optimizations, and the Agate simulator

https://doi.org/10.1016/j.jpdc.2016.03.006

Highlights

  • Key requirements of NoC power-gating simulation regarding accuracy, efficiency, full-system compatibility, extensibility and flexibility are identified.

  • Insights on the problems, solutions and optimizations generally applicable to NoC power-gating simulation platforms are provided.

  • A cycle-accurate simulator that facilitates research on NoC router power-gating is proposed.

Abstract

The static power consumption of networks-on-chip (NoCs) has been increasing with each technology generation. Power-gating is a very promising approach that can dramatically reduce NoC static power but may cause a substantial performance penalty. Significant research is needed to explore effective ways of applying power-gating to NoC routers. To enable further research advancement, a cycle-accurate NoC power-gating simulation infrastructure is much needed. In this work, we identify key requirements for NoC power-gating simulation and discuss three important optimizations that enable such simulators to handle router pipeline draining and handshaking correctly and efficiently. We also propose Agate, an effective NoC power-gating simulator that satisfies the key requirements and implements these optimizations. It can be integrated into Gem5 for closed-loop, full-system simulation of NoC-based many-core computing systems. We demonstrate the capability of Agate by simulating and evaluating several power-gating schemes, including the recently proposed Power Punch scheme.

Introduction

With an increasing number of cores integrated on a chip, networks-on-chip (NoCs) have been adopted as the de facto architecture for current and future many-core processors. While NoCs provide a more scalable interconnection solution than traditional buses and point-to-point interconnects, the added complexity of buffers, crossbars and various control logic in the NoC greatly increases its power demand. Industrial and research chips have shown that NoCs can draw a substantial percentage of the overall chip power (e.g., 17% in Niagara [17], 19% in SCORPIO [9], and 28% in Intel’s Teraflop [10]). In particular, static power consumption accounts for an increasing percentage of the total NoC power due to the continued scaling of transistor size and supply voltage. A recent study shows that router static power can be nearly 64% of the total router power because the average network utilization in real benchmarks is relatively low [7]. Therefore, it is imperative to explore effective approaches that can dramatically reduce NoC static power.

Power-gating is a very promising approach that can significantly reduce static power consumption. It has been proposed and applied to cores and execution units for some time [11], but only more recently has it been investigated for NoC routers [4], [5], [7], [8], [14], [15], [16], [19], [21]. Power-gating cuts off the power supply to routers when they are idle, thereby eliminating the static power of the entire router. However, this approach is also prone to incurring a considerable increase in packet latency. Packets that encounter a powered-off router must wait for the router to wake up before proceeding, and may experience a similar wakeup latency several times, as multiple routers along the path to the destination could be powered off.
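
The cumulative wakeup penalty described above can be illustrated with a minimal sketch. It assumes a simplified blocking model in which every powered-off router on the path adds a fixed wakeup latency on top of its normal per-hop latency. The function name `packet_latency` and the cycle counts (a 4-cycle hop and a 10-cycle wakeup) are hypothetical values chosen for illustration, not figures from this paper.

```python
def packet_latency(router_is_on, hop_cycles=4, wakeup_cycles=10):
    """Return the path latency in cycles under a simple blocking model.

    router_is_on: one boolean per router on the path (True = powered on).
    hop_cycles and wakeup_cycles are illustrative assumptions.
    """
    latency = 0
    for is_on in router_is_on:
        if not is_on:
            # The packet stalls until the gated-off router has woken up.
            latency += wakeup_cycles
        latency += hop_cycles
    return latency


all_on = packet_latency([True, True, True, True])
two_off = packet_latency([True, False, True, False])
print(all_on, two_off)  # 16 36
```

Under these assumed numbers, two powered-off routers on a four-hop path more than double the packet latency, which is why the placement and timing of wakeups matter so much to the schemes under study.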

Considering the large potential of power-gating for saving power but also its substantial impact on performance, several works have explored productive ways of applying power-gating to on-chip routers. Yet significant opportunities remain and need further exploration, particularly given the large number of factors that may influence power-gating effectiveness, such as application behavior, routing algorithm, topology, and traffic pattern. To enable exploration along these promising research lines, a cycle-accurate simulation infrastructure that supports closed-loop, full-system simulation is much needed.

In this work, we first identify the key requirements for NoC power-gating simulation and discuss three important optimizations that enable correct and efficient handling of the pipeline draining and handshaking issues specific to NoC router power-gating. We then present Agate, a detailed cycle-accurate NoC power-gating simulator that complements Garnet [1]. The proposed simulator can be integrated into the Gem5 simulator [3] to conduct full-system simulation. Agate includes features for monitoring the usage of NoC router components in Garnet, suspending and resuming the NoC router pipeline, handling the signaling between NoC routers, and modeling NoC router power state transitions effectively. It implements important optimizations to minimize simulation overheads and accurately capture the power-saving potential of the schemes under evaluation. In addition, Agate is equipped with extensive instrumentation for reporting various power-gating related statistics. The simulator is very flexible, as new power-gating schemes can be easily created within the structure of the Agate source code. Moreover, Agate can incorporate output from other power/energy modeling simulators at different levels for more accurate power/energy estimation. As an example, the output from the DSENT simulator [22] has been incorporated in Agate as a source of power input.
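
To make the notion of router power state transitions more concrete, the following is a minimal sketch of a per-router power-gating controller stepped once per simulated cycle. It is not Agate's implementation: the class `GatedRouter`, the four state names, the idle-detection threshold, and the wakeup latency are all hypothetical, chosen only to show how idle monitoring, pipeline draining, wakeup signaling from a neighbor, and statistics collection could fit together.

```python
from enum import Enum


class PowerState(Enum):
    ON = "on"
    DRAINING = "draining"
    OFF = "off"
    WAKING = "waking"


class GatedRouter:
    """Hypothetical per-router power-gating controller (illustration only)."""

    def __init__(self, idle_threshold=4, wakeup_cycles=10):
        self.idle_threshold = idle_threshold  # assumed idle cycles before gating
        self.wakeup_cycles = wakeup_cycles    # assumed wakeup latency in cycles
        self.state = PowerState.ON
        self.idle_cycles = 0
        self.wake_remaining = 0
        self.off_cycles = 0                   # statistic: cycles spent gated off

    def tick(self, flits_in_pipeline, wakeup_requested):
        """Advance one cycle given pipeline occupancy and a neighbor's wakeup signal."""
        if self.state is PowerState.ON:
            busy = flits_in_pipeline > 0 or wakeup_requested
            self.idle_cycles = 0 if busy else self.idle_cycles + 1
            if self.idle_cycles >= self.idle_threshold:
                self.state = PowerState.DRAINING
        elif self.state is PowerState.DRAINING:
            if wakeup_requested:
                # A neighbor still needs this router: abort the power-down.
                self.state = PowerState.ON
                self.idle_cycles = 0
            elif flits_in_pipeline == 0:
                self.state = PowerState.OFF
        elif self.state is PowerState.OFF:
            self.off_cycles += 1
            if wakeup_requested:
                self.state = PowerState.WAKING
                self.wake_remaining = self.wakeup_cycles
        elif self.state is PowerState.WAKING:
            self.wake_remaining -= 1
            if self.wake_remaining == 0:
                self.state = PowerState.ON
                self.idle_cycles = 0
        return self.state


router = GatedRouter(idle_threshold=2, wakeup_cycles=3)
trace = [router.tick(0, False).value for _ in range(4)]
print(trace)  # ['on', 'draining', 'off', 'off']
```

Keeping a separate draining state in the sketch reflects the point that a router cannot be cut off the moment it looks idle: in-flight flits and pending requests from neighbors must be resolved first.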

The main contributions of this work are three-fold:

  • Key requirements of NoC power-gating simulation regarding accuracy, efficiency, full-system compatibility, extensibility and flexibility are identified;

  • Insights on the problems, solutions and optimizations generally applicable to NoC power-gating simulation platforms are provided; and

  • A cycle-accurate simulator that facilitates research on NoC router power-gating is proposed.

Given the importance of NoC power-gating techniques and the useful features that Agate offers, we believe Agate will be a valuable simulation tool for the NoC community once made publicly accessible. The rest of the paper is organized as follows. Section 2 provides more background and discusses the requirements of simulating NoC power-gating. Section 3 discusses the problems, potential solutions and optimizations in NoC power-gating simulation, and Section 4 describes in detail the proposed Agate cycle-accurate NoC power-gating simulator tool. Section 5 demonstrates the proposed simulator in simulating and evaluating various power-gating schemes. Finally, related work is summarized in Section 6, and Section 7 concludes the paper.

Section snippets

Power gating of NoC routers

Recent studies have started to show the promise of power-gating as an effective approach for reducing NoC static power. As depicted in Fig. 1, it is implemented by inserting appropriately sized header transistor(s), typically a non-leaky “sleep” switch with high threshold voltage, between Vdd and the router. By controlling the sleep signal to turn off the header transistor, the supply voltage to the router is cut off, thus avoiding static power by removing the leakage currents in both

Problems, potential solutions and optimizations

When designing simulators that meet the above requirements, care must be taken to ensure correct operations in simulation while accurately reflecting the power-saving potential of the evaluated power-gating schemes. This section discusses the problems, potential solutions and three important optimizations that we propose for simulating NoC power-gating behaviors correctly and efficiently. A canonical four-stage router pipeline is assumed, which consists of route compute (RC), VC allocation

Overview

In this section we present a simulator tool, called Agate, that meets the key requirements for simulating NoC power-gating techniques, as discussed in Section 2, while also employing the proposed solutions and optimizations described in Section 3 above. Agate is an important simulation tool for facilitating the advancement of research on architecture-level power-gating techniques and their power-performance trade-offs. To enable full-system simulation, Agate is implemented as a closed-loop,

Evaluation and discussion

In this section, we use the proposed Agate NoC power-gating simulator to evaluate and compare four power-gating schemes. While preliminary versions of Agate were used in the evaluation of some recently proposed schemes in [5], [6], [7], [23], here the fully developed version is used to demonstrate how this simulation tool can be extended to simulate and evaluate various kinds of power-gating schemes, including the Power Punch scheme proposed in [7]. Full-system simulation is conducted using

Related work

As computing systems shift to the many-core paradigm, NoCs have become the primary scalable approach for on-chip communication [9], [10], [17]. Several simulation platforms for on-chip networks have been developed so far. NOXIM [18] and SICOSYS [20] are NoC-only simulators without full-system simulation support. BookSim [12] supports standalone simulation as well as integration into full-system simulators. Garnet [1] implements detailed modeling for NoCs and can be incorporated into the

Conclusion

Power-gating is a promising approach to reducing NoC static power but may cause a substantial performance penalty. In this work, we investigate the important issue of correctly and efficiently simulating NoC power-gating schemes. We identify key requirements that a NoC power-gating simulator should satisfy, propose three important optimizations to increase the accuracy and efficiency of the simulation, and present Agate, a timely simulation tool to facilitate research in exploring

Acknowledgment

This work is supported, in part, by the Software and Hardware Foundations program of the NSF’s Directorate for Computer & Information Science & Engineering.

References (23)

  • N. Agarwal, T. Krishna, L. Peh, N.K. Jha, Garnet: A detailed on-chip network model inside a full-system simulator, in:...
  • C. Bienia, K. Li, Parsec 2.0: A new benchmark suite for chipmultiprocessors, in: 5th Annual Workshop on Modeling,...
  • N. Binkert et al., The gem5 simulator, Comput. Archit. News (2011)
  • J. Camacho, J. Flich, J. Duato, H. Eberle, W. Olesinski, Towards an efficient NoC topology through multiple injection...
  • L. Chen, T.M. Pinkston, NoRD: Node-router decoupling for effective power-gating of on-chip routers, in: 45th IEEE/ACM...
  • L. Chen, L. Zhao, R. Wang, T.M. Pinkston, MP3: Minimizing performance penalty for power-gating of clos network-on-chip,...
  • L. Chen, D. Zhu, M. Pedram, T.M. Pinkston, Power punch: Towards non-blocking power-gating of NoC routers, in: 21st...
  • R. Das, S. Narayanasamy, S.K. Satpathy, R.G. Dreslinski, Catnap: Energy proportional multiple network-on-chip, in: 40th...
  • B.K. Daya, C.H.O. Chen, S. Subramanian, K. Woo-Cheol, P. Sunghyun, T. Krishna, J. Holt, A.P. Chandrakasan, L.-S. Peh,...
  • Y. Hoskote, S. Vangal, A. Singh, N. Borkar, S. Borkar, A 5-GHz mesh interconnect for a Teraflops processor, in: 40th...
  • Z. Hu, A. Buyuktosunoglu, V. Srinivasan, V. Zyuban, H. Jacobson, P. Bose, Microarchitectural techniques for power...

    Lizhong Chen is an assistant professor in the School of Electrical Engineering and Computer Science at Oregon State University. He received his Ph.D. in Computer Engineering and M.S. in Electrical Engineering from USC in 2014 and 2011, respectively, and B.S. in Electrical Engineering from Zhejiang University in 2009. His research interests are in the area of architecture, application and emerging technology of computing systems, including embedded and mobile devices, many-core processors and GPUs, data centers and high performance computing systems. He is the recipient of the Chu Kochen Award (the highest honor at Zhejiang University), the National Scholarship from the Ministry of Education of China, and the Chinese Government Award for Outstanding Students Studying Abroad.

    Di Zhu received her B.S. degree with distinction in electronic engineering from Tsinghua University, Beijing, China, in 2011. She is currently a Ph.D. candidate at the University of Southern California, under the supervision of Prof. Massoud Pedram. Her research interests include on-chip networks, hybrid electrical energy storage (HEES) systems, and dynamic power management.

    Massoud Pedram, who is the Stephen and Etta Varra Professor in the Ming Hsieh Department of Electrical Engineering at the University of Southern California, received a Ph.D. in Electrical Engineering and Computer Sciences from the University of California, Berkeley in 1991. He holds 10 US patents and has published four books, 13 book chapters, and more than 140 archival and 380 conference papers. His research ranges from low power electronics, energy-efficient processing, and cloud computing to photovoltaic cell power generation, energy storage, and power conversion, and from RT-level optimization of VLSI circuits to synthesis and physical design of quantum circuits. For this research, he and his students have received seven conference and two IEEE Transactions Best Paper Awards. Dr. Pedram is a recipient of the 1996 Presidential Early Career Award for Scientists and Engineers, a Fellow of the IEEE, an ACM Distinguished Scientist, and currently serves as the Editor-in-Chief of the ACM Transactions on Design Automation of Electronic Systems and the IEEE Journal on Emerging and Selected Topics in Circuits and Systems. He has served on the technical program committee of a number of premier conferences in his field and was the founding Technical Program Co-chair of the 1996 International Symposium on Low Power Electronics and Design and the Technical Program Chair of the 2002 International Symposium on Physical Design.

    Timothy Mark Pinkston received the B.S.E.E. degree from The Ohio State University in 1985 and the M.S.E.E. and Ph.D. degrees from Stanford University in 1986 and 1993, respectively. He is currently a professor in the Ming Hsieh Department of Electrical Engineering and Vice Dean of Faculty Affairs in the Viterbi School of Engineering at the University of Southern California. Recently, he served three years as the program director of the Computer and Information Science and Engineering Directorate of the US National Science Foundation (NSF) for the computer systems architecture area and the Expeditions in Computing Program. His research interests include interconnection networks and communication architectures for parallel processing systems, in particular multicore and multiprocessor computers. His professional service includes serving on the editorial board of IEEE Transactions on Parallel and Distributed Systems (TPDS) and on the IEEE TPDS Editor-in-Chief search and re-appointment committees. He has taken on leadership roles and membership in many conferences and workshops in the field, including ISCA, HPCA, ICPP, IPDPS, NOCS and HiPC. He recently served as the Program Chair for ICPADS’06, the General Chair for IPDPS’07, and the Program Chair for HPCA’09. He is a fellow of the IEEE.

    Lizhong Chen and Di Zhu contributed equally to this paper.
