Energy-efficient NoC with multi-granularity power optimization

The Journal of Supercomputing

Abstract

As core counts grow rapidly, the network-on-chip (NoC) consumes an increasing fraction of the power of modern processors and systems-on-chip (SoCs). Designing an energy-efficient NoC architecture is therefore very important. Multiple network-on-chip (Multi-NoC) designs have demonstrated advantages in power gating for reducing leakage power, which constitutes a significant fraction of NoC power. In this paper, we propose Chameleon, a novel heterogeneous Multi-NoC design. Chameleon employs a fine-grained power-gating algorithm that exploits power-saving opportunities at different levels of granularity simultaneously. Integrated with a congestion-aware traffic allocation policy, Chameleon achieves both high performance and low power at varying network utilization. Our experimental results on both synthetic and real workloads show that Chameleon delivers an average of 2.61% higher performance than Catnap, the best in the literature. More importantly, Chameleon consumes an average of 27.75% less power than Catnap.


References

  1. Salihundam P, Jain S, Jacob T, Kumar S, Erraguntla V, Hoskote Y, Vangal SR, Ruhl G, Borkar N (2011) A 2 Tb/s 6 × 4 mesh network for a single-chip cloud computer with DVFS in 45 nm CMOS. J Solid-State Circ 46(4):757–766

  2. Kim JS, Taylor MB, Miller J, Wentzlaff D (2003) Energy characterization of a tiled architecture processor with on-chip networks. In: Proceedings of the 2003 International Symposium on Low Power Electronics and Design, ser. ISLPED ’03. New York, ACM, pp 424–427

  3. Das R, Narayanasamy S, Satpathy SK, Dreslinski RG (2013) Catnap: energy proportional multiple network-on-chip. In: Proceedings of the 40th Annual International Symposium on Computer Architecture, ser. ISCA ’13. New York, NY, USA, ACM, pp 320–331

  4. Parikh R, Das R, Bertacco V (2014) Power-aware NoCs through routing and topology reconfiguration. In: Proceedings of the 51st Annual Design Automation Conference, ser. DAC ’14, New York, NY, USA, ACM, pp 162:1–162:6

  5. Sun C, Chen C-HO, Kurian G, Wei L, Miller J, Agarwal A, Peh LS, Stojanovic V (2012) DSENT - a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In: Proceedings of the 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’12, Washington, IEEE Computer Society, pp 201–210

  6. Moscibroda T, Mutlu O (2009) A case for bufferless routing in on-chip networks. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ser. ISCA ’09. New York, ACM, pp 196–207

  7. Samih A, Wang R, Krishna A, Maciocco C, Tai C, Solihin Y (2013) Energy-efficient interconnect via router parking. In: Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), ser. HPCA ’13, Washington, IEEE Computer Society, pp 508–519

  8. Taylor MB, Kim J, Miller J, Wentzlaff D, Ghodrat F, Greenwald B, Hoffman H, Johnson P, Lee J-W, Lee W, Ma A, Saraf A, Seneski M, Shnidman N, Strumpen V, Frank M, Amarasinghe S, Agarwal A (2002) The Raw microprocessor: a computational fabric for software circuits and general-purpose programs. IEEE Micro 22(2):25–35

  9. Sankaralingam K, Nagarajan R, Liu H, Kim C, Huh J, Burger D, Keckler SW, Moore CR (2003) Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, ser. ISCA ’03, New York, ACM, pp 422–433

  10. Wentzlaff D, Griffin P, Hoffmann H, Bao L, Edwards B, Ramey C, Mattina M, Miao C-C, Brown JF III, Agarwal A (2007) On-chip interconnection architecture of the tile processor. IEEE Micro 27(5):15–31

  11. Matsutani H, Koibuchi M, Wang D, Amano H (2008) Run-time power gating of on-chip routers using look-ahead routing. In: Proceedings of the 2008 Asia and South Pacific Design Automation Conference, ser. ASP-DAC ’08. Los Alamitos, IEEE Computer Society Press, pp 55–60

  12. Matsutani H, Koibuchi M, Ikebuchi D, Usami K, Nakamura H, Amano H (2011) Performance, area, and power evaluations of ultrafine-grained run-time power-gating routers for CMPs. Trans Comp-Aided Des Integ Cir Sys 30(4):520–533

  13. Kim G, Kim J, Yoo S (2011) FlexiBuffer: reducing leakage power in on-chip network routers. In: Proceedings of the 48th Design Automation Conference, ser. DAC ’11, New York, ACM, pp 936–941

  14. Chen L, Pinkston TM (2012) NoRD: node-router decoupling for effective power-gating of on-chip routers. In: Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-45. Washington, IEEE Computer Society, pp 270–281

  15. Chen L, Zhao L, Wang R, Pinkston TM (2014) MP3: minimizing performance penalty for power-gating of Clos network-on-chip. In: 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014, Orlando, FL, USA, February 15–19, 2014. IEEE Computer Society, pp 296–307

  16. Chen L, Zhu D, Pedram M, Pinkston TM (2015) Power punch: towards non-blocking power-gating of NoC routers. In: Proceedings of the 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), ser. HPCA ’15, Washington, IEEE Computer Society

  17. Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann

  18. Jiang N, Becker DU, Michelogiannakis G, Balfour J, Towles B, Kim J, Dally WJ (2013) A detailed and flexible cycle-accurate network-on-chip simulator. In: Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems and Software

  19. Badr M, Jerger NE (2014) SynFull: synthetic traffic models capturing cache coherent behaviour. In: Proceedings of the 41st Annual International Symposium on Computer Architecture, ser. ISCA ’14, Piscataway, NJ, USA, IEEE Press, pp 109–120. [Online]. Available: http://dl.acm.org/citation.cfm?id=2665671.2665691

  20. Bienia C (2011) Benchmarking modern multiprocessors. Ph.D. dissertation, Princeton University, Princeton, NJ, USA, AAI3445564

  21. Woo SC, Ohara M, Torrie E, Singh JP, Gupta A (1995) The SPLASH-2 programs: characterization and methodological considerations. In: Proceedings of the 22nd Annual International Symposium on Computer Architecture, ser. ISCA ’95, New York, NY, USA, ACM, pp 24–36. doi:10.1145/223982.223990

  22. Michelogiannakis G, Shalf J (2014) Variable-width datapath for on-chip network static power reduction. In: Proceedings of the 2014 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’14, Washington, DC, USA, IEEE Computer Society, pp 96–103

  23. Chaitin GJ (1982) Register allocation & spilling via graph coloring. In: SIGPLAN ’82: Proceedings of the 1982 SIGPLAN symposium on Compiler construction. ACM Press, Boston, MA, USA, pp 98–101

  24. Briggs P, Cooper KD, Torczon L (1994) Improvements to graph coloring register allocation. ACM Trans Program Lang Syst 16(3):428–455

  25. Wang L, Yang X, Dai H (2013) Scratchpad memory allocation for arrays in permutation graphs. Sci Chin Inf Sci 56(5):1–13

  26. Wang L, Xue J, Yang X (2014) Acyclic orientation graph coloring for software-managed memory allocation. Sci Chin Inf Sci 57(9):1–18

  27. Richter RJ (1990) A reconfigurable interconnection network for flexible pipelining. In: CONPAR 90-VAPP IV, Joint International Conference on Vector and Parallel Processing, Zurich, Switzerland, September 10–13, 1990, Proceedings, pp 397–404

  28. Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput 21(11):1783–1805

  29. Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomputing 25(1):43–62

  30. Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recog Artificial Intell 9(2):201–229

  31. Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomputing 10(3):243–269

  32. Mishra AK, Mutlu O, Das CR (2013) A heterogeneous multiple network-on-chip design: an application-aware approach. In: The 50th Annual Design Automation Conference 2013, DAC ’13, Austin, TX, USA, May 29–June 7, 2013

  33. Balfour J, Dally WJ (2006) Design tradeoffs for tiled CMP on-chip networks. In: Proceedings of the 20th Annual International Conference on Supercomputing, ser. ICS ’06, New York, NY, USA, ACM, pp 187–198

  34. Fallin C, Nazario G, Yu X, Chang K, Ausavarungnirun R, Mutlu O (2012) MinBD: minimally-buffered deflection routing for energy-efficient interconnect. In: Proceedings of the 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’12, Washington, DC, USA, IEEE Computer Society, pp 1–10

  35. Bokhari H, Javaid H, Shafique M, Henkel J, Parameswaran S (2014) DarkNoC: designing energy-efficient network-on-chip with multi-Vt cells for dark silicon. In: Proceedings of the 51st Annual Design Automation Conference, ser. DAC ’14, New York, NY, USA, ACM, pp 161:1–161:6

  36. Wu J, Dong D, Liao X, Wang L (2015) Chameleon: adaptive energy-efficient heterogeneous network-on-chip. In: 33rd IEEE International Conference on Computer Design, ICCD 2015, New York City, NY, USA, pp 419–422

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Nos. 61370018 and 61272482) and FANEDD under Grant No. 201450.

Author information

Corresponding author

Correspondence to Dezun Dong.

Cite this article

Wu, J., Dong, D., Liao, X. et al. Energy-efficient NoC with multi-granularity power optimization. J Supercomput 73, 1654–1671 (2017). https://doi.org/10.1007/s11227-016-1859-8
