DOI: 10.1145/2485922.2485950
research-article

Catnap: energy proportional multiple network-on-chip

Published: 23 June 2013

ABSTRACT

Multiple networks have been used in several processor implementations to scale bandwidth and ensure protocol-level deadlock freedom for different message classes. In this paper, we observe that a multiple-network design is also attractive from a power perspective and can be leveraged to achieve energy proportionality by effective power gating.

Unlike a single-network design, a multiple-network design is more amenable to power gating, as its subnetworks (subnets) can be power gated without compromising the connectivity of the network. To exploit this opportunity, we propose the Catnap architecture, which consists of synergistic subnet-selection and power-gating policies. Catnap maximizes the number of consecutive idle cycles in a router while avoiding performance loss due to overloading a subnet.
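The interplay between subnet selection and power gating described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Subnet` fields, both thresholds, the "lowest-indexed uncongested subnet first" rule, and the choice to keep subnet 0 always on are assumptions made only for illustration. Concentrating traffic on low-indexed subnets lets the others accumulate long idle runs, and spilling over on congestion avoids overloading any one subnet.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subnet:
    """Hypothetical per-subnet state as seen by one injecting node."""
    buffered_flits: int = 0   # local occupancy, used here as a congestion proxy
    idle_cycles: int = 0      # consecutive cycles without traffic
    powered_on: bool = True


CONGESTION_THRESHOLD = 8  # assumed value (flits), not taken from the paper
IDLE_THRESHOLD = 4        # assumed value (cycles) before a subnet is gated


def select_subnet(subnets: List[Subnet]) -> int:
    """Return the lowest-indexed subnet that is not congested.

    If every subnet is congested, fall back to the least-loaded one.
    """
    for i, s in enumerate(subnets):
        if s.buffered_flits < CONGESTION_THRESHOLD:
            return i
    return min(range(len(subnets)), key=lambda i: subnets[i].buffered_flits)


def tick(subnets: List[Subnet], injected_into: Optional[int]) -> None:
    """Advance one cycle: update idle counters and gate long-idle subnets."""
    for i, s in enumerate(subnets):
        if i == injected_into:
            s.idle_cycles = 0
            s.powered_on = True  # wake on demand (wake-up latency ignored)
        elif s.buffered_flits == 0:
            s.idle_cycles += 1
            # Subnet 0 stays on so the network always remains connected.
            if i > 0 and s.idle_cycles >= IDLE_THRESHOLD:
                s.powered_on = False
        else:
            s.idle_cycles = 0  # still draining traffic, so not idle


subnets = [Subnet() for _ in range(4)]
for _ in range(5):
    tick(subnets, select_subnet(subnets))   # light load: everything uses subnet 0
light_load_state = [s.powered_on for s in subnets]

subnets[0].buffered_flits = CONGESTION_THRESHOLD  # subnet 0 becomes congested
spill_target = select_subnet(subnets)
tick(subnets, spill_target)
```

Under light load only subnet 0 stays powered; once it crosses the congestion threshold, traffic spills to subnet 1, which is woken up.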

We evaluate a 256-core processor with a concentrated mesh topology using synthetic traffic and 35 applications. We show that the average network power of a power-gating-optimized multiple-network design with four subnets could be 44% lower than that of a bandwidth-equivalent single-network design, for an average performance cost of about 5%.
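As a rough back-of-the-envelope reading of these figures, the sketch below combines the two reported averages into a relative network-energy estimate. It assumes that "performance cost of about 5%" means roughly 5% longer execution time and that the 44% power reduction holds over the whole run; neither assumption is stated in the abstract, and the function name is hypothetical.

```python
def relative_energy(power_reduction: float, slowdown: float) -> float:
    """Energy relative to baseline, using energy = power x time.

    power_reduction: fractional drop in average power (0.44 for 44%).
    slowdown: fractional increase in run time (0.05 for 5%), an assumed
              interpretation of the reported performance cost.
    """
    return (1.0 - power_reduction) * (1.0 + slowdown)


ratio = relative_energy(0.44, 0.05)
print(f"relative network energy: {ratio:.3f}")
```

Under these assumptions the network would consume a little under 60% of the baseline energy, so the slowdown offsets only a small part of the power saving.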

  • Published in

    ISCA '13: Proceedings of the 40th Annual International Symposium on Computer Architecture
    June 2013
    686 pages
    ISBN: 9781450320795
    DOI: 10.1145/2485922
    • ACM SIGARCH Computer Architecture News, Volume 41, Issue 3
      ISCA '13
      June 2013
      666 pages
      ISSN: 0163-5964
      DOI: 10.1145/2508148

    Copyright © 2013 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States




    Acceptance Rates

    ISCA '13 paper acceptance rate: 56 of 288 submissions (19%). Overall acceptance rate: 543 of 3,203 submissions (17%).
