Energy-efficient NoC with multi-granularity power optimization

Wu, Ji; Dong, Dezun; Liao, Xiangke; Wang, Li

doi:10.1007/s11227-016-1859-8

Published: 10 September 2016

Volume 73, pages 1654–1671 (2017)

The Journal of Supercomputing

Ji Wu¹, Dezun Dong¹, Xiangke Liao¹ & Li Wang¹

Abstract

As core counts grow rapidly, the NoC (Network-on-Chip) consumes an increasing fraction of the power of modern processors and SoCs (Systems-on-Chip), so designing an energy-efficient NoC architecture is important. Multi-NoC (Multiple Network-on-Chip) designs have demonstrated advantages in power gating for reducing leakage power, which constitutes a significant fraction of NoC power. In this paper, we propose Chameleon, a novel heterogeneous Multi-NoC design. Chameleon employs a fine-grained power-gating algorithm that exploits power-saving opportunities at different levels of granularity simultaneously. Integrated with a congestion-aware traffic allocation policy, Chameleon achieves both high performance and low power at varying network utilization. Our experimental results on both synthetic and real workloads show that Chameleon delivers an average of 2.61% higher performance than Catnap, the best in the literature. More importantly, Chameleon consumes an average of 27.75% less power than Catnap.
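
To make the idea of congestion-aware traffic allocation over power-gated subnetworks more concrete, the following is a minimal illustrative sketch in Python. It is not the algorithm from the paper, whose details are not given in the abstract. The `Subnet` class, the `capacity` and `occupancy` fields, the 0.75 congestion threshold, and the rule that the first subnet always stays on are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subnet:
    """One subnetwork of a hypothetical Multi-NoC (all fields are illustrative)."""
    name: str
    capacity: int             # buffer slots available when powered on
    powered_on: bool = False  # False means the subnet is power-gated
    occupancy: int = 0        # buffer slots currently in use

    def congested(self, threshold: float) -> bool:
        # A subnet counts as congested once its buffers fill past the threshold.
        return self.occupancy >= threshold * self.capacity


def select_subnet(subnets: List[Subnet], threshold: float = 0.75) -> Subnet:
    """Pick a subnet for the next packet, waking a gated one only when needed."""
    # Prefer a subnet that is already on and has headroom.
    for s in subnets:
        if s.powered_on and not s.congested(threshold):
            return s
    # All active subnets are congested: wake the next gated subnet.
    for s in subnets:
        if not s.powered_on:
            s.powered_on = True
            return s
    # Everything is on and congested: fall back to the least loaded subnet.
    return min(subnets, key=lambda s: s.occupancy / s.capacity)


def gate_idle(subnets: List[Subnet]) -> None:
    """Power-gate idle subnets; the first one stays on to keep the chip connected."""
    for s in subnets[1:]:
        if s.powered_on and s.occupancy == 0:
            s.powered_on = False
```

In this sketch, traffic stays on the already-active subnets at low load, so the remaining subnets can stay gated and save leakage power. Additional subnets are woken only when congestion is detected, and `gate_idle` turns them off again once they drain.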

References

Salihundam P, Jain S, Jacob T, Kumar S, Erraguntla V, Hoskote Y, Vangal SR, Ruhl G, Borkar N (2011) A 2 tb/s 6 x 4 mesh network for a single-chip cloud computer with dvfs in 45 nm cmos. J Solid-State Circ 46(4):757–766
Kim JS, Taylor MB, Miller J, Wentzlaff D (2003) Energy characterization of a tiled architecture processor with on-chip networks. In: Proceedings of the 2003 International Symposium on Low Power Electronics and Design, ser. ISLPED ’03. New York, ACM, pp 424–427
Das R, Narayanasamy S, Satpathy SK, Dreslinski RG (2013) Catnap: energy proportional multiple network-on-chip. In: Proceedings of the 40th Annual International Symposium on Computer Architecture, ser. ISCA ’13. New York, NY, USA, ACM, pp 320–331
Parikh R, Das R, Bertacco V (2014) Power-aware nocs through routing and topology reconfiguration. In: Proceedings of the 51st Annual Design Automation Conference, ser. DAC ’14, New York, NY, USA, ACM, pp 162:1–162:6
Sun C, Chen C-HO, Kurian G, Wei L, Miller J, Agarwal A, Peh LS, Stojanovic V (2012) Dsent - a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In: Proceedings of the 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’12, Washington, IEEE Computer Society, pp 201–210
Moscibroda T, Mutlu O (2009) A case for bufferless routing in on-chip networks. In: Proceedings of the 36th Annual International Symposium on Computer Architecture, ser. ISCA ’09. New York, ACM, pp 196–207
Samih A, Wang R, Krishna A, Maciocco C, Tai C, Solihin Y (2013) Energy-efficient interconnect via router parking. In: Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), ser. HPCA ’13, Washington, IEEE Computer Society, pp 508–519
Taylor MB, Kim J, Miller J, Wentzlaff D, Ghodrat F, Greenwald B, Hoffman H, Johnson P, Lee J-W, Lee W, Ma A, Saraf A, Seneski M, Shnidman N, Strumpen V, Frank M, Amarasinghe S, Agarwal A (2002) The raw microprocessor: A computational fabric for software circuits and general-purpose programs. IEEE Micro 22(2):25–35
Sankaralingam K, Nagarajan R, Liu H, Kim C, Huh J, Burger D, Keckler SW, Moore CR (2003) Exploiting ilp, tlp, and dlp with the polymorphous trips architecture. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, ser. ISCA ’03, New York, ACM, pp 422–433
Wentzlaff D, Griffin P, Hoffmann H, Bao L, Edwards B, Ramey C, Mattina M, Miao C-C, Brown JF III, Agarwal A (2007) On-chip interconnection architecture of the tile processor. IEEE Micro 27(5):15–31
Matsutani H, Koibuchi M, Wang D, Amano H (2008) Run-time power gating of on-chip routers using look-ahead routing. In: Proceedings of the 2008 Asia and South Pacific Design Automation Conference, ser. ASP-DAC ’08. Los Alamitos, IEEE Computer Society Press, pp 55–60
Matsutani H, Koibuchi M, Ikebuchi D, Usami K, Nakamura H, Amano H (2011) Performance, area, and power evaluations of ultrafine-grained run-time power-gating routers for cmps. Trans Comp-Aided Des Integ Cir Sys 30(4):520–533
Kim G, Kim J, Yoo S (2011) Flexibuffer: reducing leakage power in on-chip network routers. In: Proceedings of the 48th Design Automation Conference, ser. DAC ’11, New York, ACM, pp 936–941
Chen L, Pinkston TM (2012) Nord: node-router decoupling for effective power-gating of on-chip routers. In: Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, ser. MICRO-45. Washington, IEEE Computer Society, pp 270–281
Chen L, Zhao L, Wang R, Pinkston TM (2014) MP3: minimizing performance penalty for power-gating of clos network-on-chip. In: 20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014, Orlando, FL, USA, February 15–19, 2014. IEEE Computer Society, pp 296–307
Chen L, Zhu D, Pedram M, Pinkston TM (2015) Power punch: towards non-blocking power-gating of noc routers. In: Proceedings of the 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), ser. HPCA ’15, Washington, IEEE Computer Society
Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann
Jiang N, Becker DU, Michelogiannakis G, Balfour J, Towles B, Kim J, Dally WJ (2013) A detailed and flexible cycle-accurate network-on-chip simulator. In: Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems and Software
Badr M, Jerger NE (2014) Synfull: Synthetic traffic models capturing cache coherent behaviour. In: Proceedings of the 41st Annual International Symposium on Computer Architecture, ser. ISCA ’14, Piscataway, NJ, USA: IEEE Press, pp 109–120. [Online]. Available: http://dl.acm.org/citation.cfm?id=2665671.2665691
Bienia C (2011) Benchmarking modern multiprocessors. Ph.D. dissertation, Princeton, NJ, USA, aAI3445564
Woo SC, Ohara M, Torrie E, Singh JP, Gupta A (1995) The splash-2 programs: characterization and methodological considerations. In: Proceedings of the 22nd Annual International Symposium on Computer Architecture, ser. ISCA ’95, New York, NY, USA, ACM, pp 24–36. [Online]. doi:10.1145/223982.223990
Michelogiannakis G, Shalf J (2014) Variable-width datapath for on-chip network static power reduction. In: Proceedings of the 2014 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’14, Washington, DC, USA, IEEE Computer Society, pp 96–103
Chaitin GJ (1982) Register allocation & spilling via graph coloring. In: SIGPLAN ’82: Proceedings of the 1982 SIGPLAN symposium on Compiler construction. ACM Press, Boston, MA, USA, pp 98–101
Briggs P, Cooper KD, Torczon L (1994) Improvements to graph coloring register allocation. ACM Trans Program Lang Syst 16(3):428–455
Wang L, Yang X, Dai H (2013) Scratchpad memory allocation for arrays in permutation graphs. Sci Chin Inf Sci 56(5):1–13
Wang L, Xue J, Yang X (2014) Acyclic orientation graph coloring for software-managed memory allocation. Sci Chin Inf Sci 57(9):1–18
Richter RJ (1990) A reconfigurable interconnection network for flexible pipelining. In: CONPAR 90-VAPP IV, Joint International Conference on Vector and Parallel Processing, Zurich, Switzerland, September 10–13, Proceedings, 1990, pp 397–404
Bhandarkar SM, Arabnia HR (1995) The refine multiprocessor—theoretical properties and algorithms. Parallel Comput 21(11):1783–1805
Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomputing 25(1):43–62
Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recog Artificial Intell 9(2):201–229
Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomputing 10(3):243–269
Mishra AK, Mutlu O, Das CR (2013) A heterogeneous multiple network-on-chip design: an application-aware approach. In: The 50th Annual Design Automation Conference 2013, DAC ’13, Austin, TX, USA, May 29-June 07, 2013
Balfour J, Dally WJ (2006) Design tradeoffs for tiled cmp on-chip networks. In: Proceedings of the 20th Annual International Conference on Supercomputing, ser. ICS ’06, New York, NY, USA, ACM, pp 187–198
Fallin C, Nazario G, Yu X, Chang K, Ausavarungnirun R, Mutlu O (2012) Minbd: minimally-buffered deflection routing for energy-efficient interconnect. In: Proceedings of the 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, ser. NOCS ’12, Washington, DC, USA, IEEE Computer Society, pp 1–10
Bokhari H, Javaid H, Shafique M, Henkel J, Parameswaran S (2014) Darknoc: designing energy-efficient network-on-chip with multi-vt cells for dark silicon. In: Proceedings of the 51st Annual Design Automation Conference, ser. DAC ’14, New York, NY, USA, ACM, pp 161:1–161:6
Wu J, Dong D, Liao X, Wang L (2015) Chameleon: adaptive energy-efficient heterogeneous network-on-chip. In: 33rd IEEE International Conference on Computer Design, ICCD 2015, New York City, NY, USA, pp 419–422

Acknowledgments

This research is supported by the National Natural Science Foundation of China (No. 61370018, 61272482) and FANEDD under Grant No. 201450.

Author information

Authors and Affiliations

National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, 410073, China
Ji Wu, Dezun Dong, Xiangke Liao & Li Wang

Corresponding author

Correspondence to Dezun Dong.

About this article

Cite this article

Wu, J., Dong, D., Liao, X. et al. Energy-efficient NoC with multi-granularity power optimization. J Supercomput 73, 1654–1671 (2017). https://doi.org/10.1007/s11227-016-1859-8

Issue Date: April 2017