Exploring Heterogeneous NoC Design Space in Heterogeneous GPU-CPU Architectures

Fang, Juan; Leng, Zhen-Yu; Liu, Si-Tong; Yao, Zhi-Cheng; Sui, Xiu-Feng

doi:10.1007/s11390-015-1505-6

Regular Paper
Published: 21 January 2015

Volume 30, pages 74–83, (2015)

Journal of Computer Science and Technology

Juan Fang¹, Zhen-Yu Leng¹, Si-Tong Liu¹, Zhi-Cheng Yao² & Xiu-Feng Sui²

Abstract

Computer architecture is moving from the multicore era into the heterogeneous era. Heterogeneous architectures use on-chip networks to access shared resources, so how the network is configured is likely to have a significant impact on overall performance and power consumption. Heterogeneous networks on chip (NoCs) have recently been proposed to achieve performance comparable to that of NoCs built from buffered routers while reducing buffer cost and energy consumption. However, heterogeneous NoC design for heterogeneous GPU-CPU architectures has not been studied in depth. This paper first evaluates the performance and power consumption of a variety of static hot-potato-based heterogeneous NoCs with different placements of buffered and bufferless routers, which helps to explore the design space for heterogeneous GPU-CPU interconnection. It then proposes Unidirectional Flow Control (UFC), a simple credit-based flow control mechanism that controls network congestion in heterogeneous NoCs for GPU-CPU architectures. UFC guarantees that buffered routers always have unoccupied entries to receive flits arriving from adjacent bufferless routers. Our evaluations show that, compared with hot-potato routing, UFC improves performance by an average of 14.1% while increasing energy by an average of only 5.3%.
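
The abstract does not say how UFC sizes or replenishes its credits, so the following is only a minimal sketch of the general idea, under stated assumptions. It models one input buffer of a buffered router in which some entries are held back for a bufferless neighbour, which cannot store a flit and would otherwise have to deflect it. The class name, the `depth` and `reserved` parameters, and the rule that credit-controlled senders may not consume the reserved entries are all hypothetical and are not taken from the paper.

```python
from collections import deque


class BufferedInputPort:
    """Toy model of a buffered router's input buffer.

    Hypothetical illustration only: the reservation rule below is an
    assumption, not the mechanism described in the paper.
    """

    def __init__(self, depth: int, reserved: int):
        if not 0 < reserved <= depth:
            raise ValueError("reserved must be between 1 and depth")
        self.depth = depth        # total buffer entries
        self.reserved = reserved  # entries held back for bufferless neighbours
        self.fifo = deque()

    def free_entries(self) -> int:
        return self.depth - len(self.fifo)

    def credits_for_buffered_sender(self) -> int:
        # Senders that have their own buffers only see the entries
        # beyond the reserved pool, so they can never fill the buffer.
        return max(0, self.free_entries() - self.reserved)

    def accept_from_buffered(self, flit) -> bool:
        # A buffered sender without credit simply holds the flit upstream.
        if self.credits_for_buffered_sender() == 0:
            return False
        self.fifo.append(flit)
        return True

    def accept_from_bufferless(self, flit) -> bool:
        # A bufferless sender cannot hold the flit; it may use any free
        # entry, including the reserved ones.
        if self.free_entries() == 0:
            return False  # in a real network this would force a deflection
        self.fifo.append(flit)
        return True

    def drain(self):
        # Forward the oldest flit, freeing one entry.
        return self.fifo.popleft() if self.fifo else None


if __name__ == "__main__":
    port = BufferedInputPort(depth=4, reserved=1)
    for i in range(4):
        print("buffered flit", i, "accepted:", port.accept_from_buffered(f"b{i}"))
    print("free entries left:", port.free_entries())
    print("bufferless flit accepted:", port.accept_from_bufferless("x"))
```

In this toy model the credit-controlled sender is cut off while an entry is still free, so a flit from the bufferless neighbour can still be absorbed. Whether that entry is always available in practice depends on how quickly the router drains its buffer, which this sketch does not model.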

Author information

Authors and Affiliations

College of Computer Science, Beijing University of Technology, Beijing, 100124, China
Juan Fang, Zhen-Yu Leng & Si-Tong Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Zhi-Cheng Yao & Xiu-Feng Sui

Corresponding author

Correspondence to Juan Fang.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61202076, 61202062.

About this article

Cite this article

Fang, J., Leng, ZY., Liu, ST. et al. Exploring Heterogeneous NoC Design Space in Heterogeneous GPU-CPU Architectures. J. Comput. Sci. Technol. 30, 74–83 (2015). https://doi.org/10.1007/s11390-015-1505-6

Received: 15 July 2014
Revised: 12 November 2014
Published: 21 January 2015
Issue Date: January 2015
DOI: https://doi.org/10.1007/s11390-015-1505-6
