ABSTRACT
Multiple networks have been used in several processor implementations to scale bandwidth and ensure protocol-level deadlock freedom for different message classes. In this paper, we observe that a multiple-network design is also attractive from a power perspective and can be leveraged to achieve energy proportionality by effective power gating.
Unlike a single-network design, a multiple-network design is amenable to power gating, because its subnetworks (subnets) can be power gated without compromising the connectivity of the network. To exploit this opportunity, we propose the Catnap architecture, which consists of synergistic subnet-selection and power-gating policies. Catnap maximizes the number of consecutive idle cycles in a router while avoiding the performance loss caused by overloading a subnet.
We evaluate a 256-core processor with a concentrated mesh topology using synthetic traffic and 35 applications. We show that the average network power of a power-gating-optimized multiple-network design with four subnets could be 44% lower than that of a bandwidth-equivalent single-network design, at an average performance cost of about 5%.
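To make the idea of pairing subnet selection with power gating more concrete, the following is a minimal sketch and not the paper's actual policy. It assumes a fixed priority order among subnets, a per-subnet boolean congestion signal, an idle-cycle threshold for gating, and that subnet 0 is never gated; the names `Subnet`, `select_subnet`, `tick`, and `idle_threshold` are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Subnet:
    """One subnet as seen from a single node (hypothetical model)."""
    congested: bool = False   # assumed local congestion signal
    idle_cycles: int = 0      # consecutive cycles without traffic
    powered_on: bool = True


def select_subnet(subnets):
    """Return the index of the lowest-numbered subnet that is not congested.

    Filling subnets in a fixed order concentrates traffic on a few of them,
    so the others accumulate long idle stretches. If every subnet is
    congested, fall back to the last one.
    """
    for i, s in enumerate(subnets):
        if not s.congested:
            return i
    return len(subnets) - 1


def tick(subnets, used_index, idle_threshold=4):
    """Advance one cycle: update idle counters and gate long-idle subnets.

    used_index is the subnet that carried a packet this cycle, or None.
    idle_threshold is an illustrative value, not a figure from the paper.
    """
    for i, s in enumerate(subnets):
        if i == used_index:
            s.idle_cycles = 0
            s.powered_on = True   # wake-up latency is ignored in this sketch
        else:
            s.idle_cycles += 1
            # Assumption: subnet 0 is never gated, so the node stays reachable.
            if i > 0 and s.idle_cycles >= idle_threshold:
                s.powered_on = False
```

Under these assumptions, a lightly loaded node sends everything on subnet 0, and the remaining subnets are gated once they reach the idle threshold. A higher-numbered subnet is chosen, and woken, only when the lower ones report congestion.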
Catnap: energy proportional multiple network-on-chip. ISCA '13.