All-to-all broadcasting in torus Network on Chip

Touzene, Abderezak; Day, Khaled

doi:10.1007/s11227-015-1406-z

Published: 21 March 2015

Volume 71, pages 2585–2596, (2015)

The Journal of Supercomputing

Abderezak Touzene¹ &
Khaled Day¹

Abstract

This paper proposes an all-to-all broadcasting algorithm for a 2D torus Network on Chip (NoC) and evaluates its performance. The algorithm uses special spanning trees called NEWS spanning trees. These trees are link-conflict-free, which implies that the communication steps of the all-to-all algorithm are contention-free. The proposed algorithm is optimal in terms of transmission time and, unlike the all-to-all algorithm for the 2D torus in (IEEE Trans Comput 50:1029–1032, 2001), does not need any additional buffer memory. Reducing buffer space is an important concern in NoC architectures. Our algorithm is therefore a more efficient solution for all-to-all broadcasting in 2D torus NoC multi-core systems than previously proposed algorithms.
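To make the problem concrete, the sketch below simulates all-to-all broadcasting on a small 2D torus. It is not the NEWS spanning tree algorithm of the paper, whose construction the abstract does not give. It is a simple dimension-ordered scheme (row rings first, then column rings) chosen only for illustration. The function names `torus_neighbors` and `all_to_all_broadcast`, the N/E/W/S direction labels, and the step counting are assumptions made for this example.

```python
def torus_neighbors(x, y, rows, cols):
    """Return the four wraparound neighbours of node (x, y) in a rows x cols torus."""
    return {
        "N": ((x - 1) % rows, y),
        "E": (x, (y + 1) % cols),
        "W": (x, (y - 1) % cols),
        "S": ((x + 1) % rows, y),
    }


def all_to_all_broadcast(rows, cols):
    """Illustrative dimension-ordered all-to-all broadcast (not the paper's algorithm).

    Returns (known, steps): known[node] is the set of source nodes whose
    message has reached that node, and steps is the number of rounds used.
    """
    nodes = [(x, y) for x in range(rows) for y in range(cols)]
    known = {n: {n} for n in nodes}
    steps = 0

    # Phase 1: every node forwards eastward along its row ring.
    # In each round a node passes on what it received in the previous round.
    outgoing = {n: {n} for n in nodes}
    for _ in range(cols - 1):
        incoming = {}
        for n in nodes:
            east = torus_neighbors(n[0], n[1], rows, cols)["E"]
            incoming[east] = outgoing[n]
        for n in nodes:
            known[n] |= incoming[n]
        outgoing = incoming
        steps += 1

    # Phase 2: every node forwards its whole row bundle southward
    # along its column ring.
    outgoing = {n: set(known[n]) for n in nodes}
    for _ in range(rows - 1):
        incoming = {}
        for n in nodes:
            south = torus_neighbors(n[0], n[1], rows, cols)["S"]
            incoming[south] = outgoing[n]
        for n in nodes:
            known[n] |= incoming[n]
        outgoing = incoming
        steps += 1

    return known, steps


if __name__ == "__main__":
    known, steps = all_to_all_broadcast(4, 4)
    complete = all(len(received) == 16 for received in known.values())
    print("rounds:", steps, "all nodes complete:", complete)
```

In this sketch each link carries at most one bundle per round, so no two transfers compete for a link, but only one of a node's four outgoing links is used at a time and the bundles in the second phase are a whole row in size. Using all four links at once without link conflicts is the kind of improvement that the spanning-tree approach described in the abstract targets.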

References

Ackland B et al (2000) A single chip, 1.6-Billion, 16-b MAC/s Multiprocessor DSP. IEEE J Solid State Circuits:412–424
Benini L, De Micheli G (2002) Networks on chips: a new SoC paradigm. IEEE Comput 35:70–78
Benini L, De Micheli G (2000) System-level power optimization: techniques and tools. ACM Trans Design Autom Electr Syst:115–192
Dally WJ, Towles B (2001) Route packets, not wires: on-chip interconnection networks. In: Proc. Design Automation Conf. (DAC). pp 684–689
Marculescu R, Ogras UY, Peh L, Jerger NE, Hoskote Y (2009) Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives. IEEE Trans Comput Aided Design Integr Circuits Syst 28(1):3–21
Bertozzi D et al (2005) NoC synthesis flow for customized domain specific multiprocessor systems-on-chip. IEEE Trans Parallel Distrib Syst 16(2):113–129
Benini L, De Micheli G (2006) Networks on chips: technology and tools. Morgan Kaufmann
Guerrier P, Greiner A (2000) A generic architecture for on-chip packet-switched interconnections. In: Proc. Design and Test in Europe (DATE). pp 250–256
Kumar S et al (2002) A network on chip architecture and design methodology. Proc Intl Symposium VLSI (ISVLSI):117–124
Bjerregaard T, Mahadevan S (2006) A survey of research and practices of network-on-chip. ACM Comput Surv 38(1) (article 1)
Ogras UY, Hu J, Marculescu R (2005) Key research problems in NoC design: a holistic perspective. In: CODES. pp 69–75
Dally WJ (1990) Performance analysis of k-ary n-cube interconnection networks. IEEE Trans Comput 39(6):775–785
Dally WJ, Seitz CL (1986) The torus routing chip. J Distrib Comput 1(4):187–196
Zhang Z, Guo Z, Yang Y (2012) Efficient all-to-all broadcast in Gaussian On-Chip-Networks. IEEE Trans Comput 62(10):1959–1971
Saad Y, Schultz MH (1989) Data communication in parallel architectures. Parallel Comput 11:131–150
Johnsson SL, Ho CT (1989) Optimum broadcasting and personalized communication in hypercubes. IEEE Trans Comput 38(9):1249–1268
Bruck J, Ho CT, Kipnis S, Weathersby D (1994) Efficient Algorithms for All-to-All Communications in Multi-Port Message-Passing Systems. In: ACM Symposium on Parallel Algorithms and Architectures. pp 298–309
Calvin C, Perennes S, Trystram D (1995) All-to-all broadcast in torus with wormhole-like routing. In: Proc. of 7th IEEE Symposium on Parallel and Distributed Processing. pp 130–137
Yang Y, Wang J (1999) Efficient all-to-all broadcast in all-port mesh and torus networks. In: Proceedings of the Fifth International Symposium on High-Performance Computer Architecture. Orlando, pp 290–299
Yang Y, Wang J (2001) Pipelined all-to-all broadcast in all-port meshes and tori. IEEE Trans Comput 50(10):1029–1032
Yang Y, Wang J (2002) Near-optimal all-to-all broadcast in multidimensional all-port meshes and tori. IEEE Trans Parallel Distrib Syst 13(2):128–141
Huang H (2010) Efficient all-to-all broadcast algorithm in torus networks. IEEE Int Conf Intell Comput Intell Syst:911–916
Touzene A, Plateau B (1991) Optimal multinode broadcast on a mesh connected graph with reduced bufferization. Distrib Memory Comput Lect Notes Comput Sci 487:143–152
Hassoun S, Alpert CJ, Thiagarajan M (2002) Optimal buffered routing path construction for single and multiple clock domain systems. In: Proceedings of the 2002 IEEE/ACM International Conference on Computer-Aided Design. pp 247–253
Ogras UY, Marculescu R (2006) It’s a small world after all: NoC performance optimization via long-range link insertion. IEEE Trans Very Large Scale Integr Syst 14(7):693–706
Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput 21(11):1783–1806
Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th Annual International High Performance Computing Conference. The 1993 High Performance Computing: New Horizons Supercomputing Symposium. Calgary, Alberta, Canada, pp 349–357
Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multi-ring network. J Supercomput 25(1):43–63
Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput 10(2):188–193
Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitized images. Int J Eurographics Assoc 8(1):3–12
Bhandarkar SM, Arabnia HR (1995) The hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114
Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–433
Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput 10(3):243–270
Arabnia HR, Oliver MA (1987) Arbitrary rotation of raster images with SIMD machine architectures. Int J Eurographics Assoc 6(1):3–12
Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recogn Artif Intell 9(2):201–229
Arabnia HR (1996) Distributed Stereocorrelation Algorithm. Int J Comput Commun 707–712
Touzene A (2014) On all-to-all broadcast in dense gaussian network on-chip. IEEE Trans Parallel Distrib Syst 99:1
Touzene A (2014) All-to-all broadcast in hexagonal torus networks on-chip. IEEE Trans Parallel Distrib Syst 99:1 (no. preprints)

Author information

Authors and Affiliations

Department of Computer Science, Sultan Qaboos University, Muscat, Sultanate of Oman
Abderezak Touzene & Khaled Day

Corresponding author

Correspondence to Abderezak Touzene.

About this article

Cite this article

Touzene, A., Day, K. All-to-all broadcasting in torus Network on Chip. J Supercomput 71, 2585–2596 (2015). https://doi.org/10.1007/s11227-015-1406-z

Published: 21 March 2015
Issue Date: July 2015
DOI: https://doi.org/10.1007/s11227-015-1406-z
