Scheduling array redistribution with virtual channel support

Souravlas, Stavros; Roumeliotis, Manos

doi:10.1007/s11227-015-1519-4

Scheduling array redistribution with virtual channel support

Published: 10 September 2015

Volume 71, pages 4215–4234, (2015)
Cite this article

The Journal of Supercomputing Aims and scope Submit manuscript

Stavros Souravlas¹ &
Manos Roumeliotis¹

102 Accesses
2 Citations
Explore all metrics

Abstract

This paper addresses the well-known problem of redistributing data arrays over a multiprocessor network. The block-cyclic redistribution techniques found in the literature deal effectively with this problem, based on the assumption that there is a direct link between all the processors of the network. Most of these techniques aim at reducing the number of messages and the total redistribution cost. However, an application of these techniques on non-all-to-all communication networks, like tori, shows that these techniques suffer long delays. In this work, we try to solve the general block-cyclic redistribution problem on non-all-to-all networks, by grouping the messages into well-defined classes, and transferring them with the support of a well-specified number of virtual channels.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

Arabnia HR, Oliver MA (1987) Arbitrary rotation of raster images with SIMD machine architectures. Int J Eurogr Assoc (Comput Gr Forum) 6(1):3–12
Google Scholar
Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–433
Article Google Scholar
Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Int J Eurogr Assoc (Comput Gr Forum) 8(1):3–12
Google Scholar
Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput 10(2):188–193
Article Google Scholar
Arabnia HR, Smith JW (1993) A Reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th annual international high performance computing conference. The 1993 high performance computing: new horizons supercomputing symposium. Calgary, Alberta, Canada, June, pp 349–357
Arabnia HR (1995) A distributed stereocorrelation algorithm. In: Proceedings of computer communications and networks (ICCCN’95), IEEE, pp 479–482
Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput 10(3):243–270
Article MATH Google Scholar
Bhandarkar SM, Arabnia HR, Smith JW (1995) A reconfigurable architecture for image processing and computer vision. Int J Pattern Recogn Artif Intell (IJPRAI) (special issue on VLSI algorithms and architectures for computer vision, image processing, pattern recognition and AI) 9(2):201–229
Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor: theoretical properties and algorithms. Parallel Comput 21(11):1783–1806
Article Google Scholar
Bhandarkar SM, Arabnia HR (1995) The Hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114
Article Google Scholar
Chung YC, Hsu CH, Bai SW (1998) A basic-cycle calculation technique for efficient dynamic data redistribution. IEEE Trans Parallel Distrib Syst 32:359–377
Article Google Scholar
Desprez F, Dongarra J, Petitet A, Randriamaro C, Robert Yves (1998) Scheduling block-cyclic array redistribution. IEEE Trans Parallel Distrib Syst 9(2):192–205
Article Google Scholar
Jun D, Negishi Y (2010) Overlapping methods of all-to-all communication and FFT algorithms for torus-connected massively parallel supercomputers. International conference on high performance computing, networking, storage and analysis (SC). New Orleans, Louisiana, pp 1–9
Faraj A, Xin Y, Patarasuk P (2007) A message scheduling scheme for all-to-all personalized communication on ethernet switched clusters. IEEE Trans Parallel Distrib Syst 18(2):264–276
Article Google Scholar
Feng R, Zhang P, Deng Y (2013) Deadlock-free routing algorithms for 6D Mesh/iBT interconnection networks. In: Proceedings of the 5th IEEE international conference on information technology: new generations. Las Vegas, Nevada, pp 253–258
Kalns ET, Ni LM (1995) Processor mapping techniques toward efficient data redistribution. IEEE Trans Parallel Distrib Syst 6:1234–1247
Article Google Scholar
Hsu C-H, Chung Y-C, Yang D-L, Dow C-R (2001) A generalized processor mapping technique for array redistribution. IEEE Trans Parallel Distrib Syst 12(7):743–757
Article Google Scholar
Huang JW, Chu CP (2006) An efficient communication scheduling method for the processor mapping technique applied data redistribution. J Supercomput 37:297–318
Article Google Scholar
Kumar S, Sabharwal Y, Garg R, Heidelberger P (2008) Optimization of all-to-all communication on the blue Gene/L supercomputer. In: Proceedings of the 37th international conference on parallel processing. Washington, DC, USA, pp 320–329
Lim YW, Bhat PB, Prasanna VK (1998) Efficient algorithms for block cyclic redistribution of arrays. Algorithmica 24:298–330
Article MathSciNet Google Scholar
Liu G, Gu N, Ren K, Tao Y (2006) Optimal all-to-all personalized communication in All-Port Tori. First international multi-symposiums on computer and computational sciences, IMSCCS ’06. Hanzhou, Zhejiang, pp 20–24
Luo W, Xiang D (2012) Deadlock-free routing algorithm for torus networks. IEEE Trans Parallel Distrib Syst 23(5):800–808
Article Google Scholar
Ramanujam RS, Lin B (2013) Randomized throughput-optimal oblivious routing for torus networks. IEEE Trans Comput 62(6):561–574
Article MathSciNet Google Scholar
Ramaswamy S, Simons P, Banerjee P (1996) Optimization for efficient array redistribution on distributed memory multicomputers. J Parallel Distrib Comput 38:217–228
Article MATH Google Scholar
Souravlas SI, Roumeliotis M (2004) A pipeline technique for dynamic data transfer on a multiprocessor grid. Int J Parallel Progr 32(5):361–388
Article MATH Google Scholar
Thakur R, Choudhary A, Ramanujam J (1996) Efficient algorithms for array redistribution. IEEE Trans Parallel Distrib Syst 7:587–594
Article Google Scholar
Wani MA, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multi-ring network. J Supercomput 25(1):43–63
Article MATH Google Scholar
Yang Y, Wang J (2002) Near-optimal all-to-all broadcast in multidimensional all-port meshes and tori. IEEE Comput Trans Parallel Distrib Syst 13(2):128–141
Article Google Scholar
Zhang P, Deng Y (2012) Design and analysis of pipelined broadcast algorithms for the all-port interlaced bypass torus networks. IEEE Comput Trans Parallel Distrib Syst 23(12):2245–2253
Article Google Scholar

Download references

Author information

Authors and Affiliations

Applied Informatics Department, University of Macedonia, 156 Egnatia St., 54006, Thessaloniki, Greece
Stavros Souravlas & Manos Roumeliotis

Authors

Stavros Souravlas
View author publications
You can also search for this author in PubMed Google Scholar
Manos Roumeliotis
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Stavros Souravlas.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Souravlas, S., Roumeliotis, M. Scheduling array redistribution with virtual channel support. J Supercomput 71, 4215–4234 (2015). https://doi.org/10.1007/s11227-015-1519-4

Download citation

Published: 10 September 2015
Issue Date: November 2015
DOI: https://doi.org/10.1007/s11227-015-1519-4

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Scheduling array redistribution with virtual channel support

Abstract

Access this article

Similar content being viewed by others

COSTA: Communication-Optimal Shuffle and Transpose Algorithm with Process Relabeling

A Methodology for Handling Data Movements by Anticipation: Position Paper

On Parallel Array P Systems

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Scheduling array redistribution with virtual channel support

Abstract

Access this article

Similar content being viewed by others

COSTA: Communication-Optimal Shuffle and Transpose Algorithm with Process Relabeling

A Methodology for Handling Data Movements by Anticipation: Position Paper

On Parallel Array P Systems

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation