A message passing strategy for array redistributions in a torus network

The Journal of Supercomputing

Abstract

The array redistribution problem arises in many important parallel computing applications. In this paper, we consider this problem in a torus network. Tori are preferred to other multidimensional networks (such as hypercubes) because they scale better (IEEE Trans. Parallel Distrib. Syst. 50(10), 1201–1218, [2001]). We present a message-combining approach that splits any array redistribution problem into a series of broadcasts in which all sources send messages of the same size, so the traffic load is balanced. Unlike existing array redistribution algorithms, the scheme introduced in this work eliminates the need for data reorganization in the memory of the source and target processors. Moreover, the processing of the scheduled broadcasts is pipelined, which reduces the total cost of redistribution.
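To make the problem concrete, the following minimal Python sketch counts how many array elements each source processor must send to each target processor during a redistribution. It assumes block-cyclic distributions, cyclic(r) over P processors changed to cyclic(s) over Q processors, which is the case studied in much of the cited literature; the abstract itself does not fix the distribution type. The names `owner` and `communication_counts` are hypothetical. This illustrates only the input to a redistribution schedule, not the message-combining or pipelining scheme of the paper.

```python
from collections import Counter


def owner(index, block, procs):
    """Processor that owns global element `index` under a cyclic(block)
    distribution over `procs` processors (assumed 1-D block-cyclic layout)."""
    return (index // block) % procs


def communication_counts(n, r, p, s, q):
    """Count the elements exchanged by every (source, target) pair when an
    n-element array moves from cyclic(r) on p processors to cyclic(s) on q."""
    counts = Counter()
    for i in range(n):
        src = owner(i, r, p)   # owner before redistribution
        dst = owner(i, s, q)   # owner after redistribution
        counts[(src, dst)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    table = communication_counts(n=24, r=2, p=4, s=3, q=4)
    for (src, dst), size in sorted(table.items()):
        print(f"P{src} -> P{dst}: {size} element(s)")
```

Pairs with unequal counts are what make naive schedules unbalanced; a strategy such as the one described above would group these transfers so that every source sends messages of the same size in each step.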

References

  1. Yang Y, Wang J (2001) Pipelined all-to-all broadcast in all-port meshes and tori. IEEE Trans Parallel Distrib Syst 50(10):1201–1218

  2. Kaushik SD, Huang CH, Johnson RW, Sadayappan P (1994) An approach to communication-efficient data redistribution. In: Proceedings of the 8th ACM international conference on supercomputing, July 1994, Manchester, England

  3. Park N, Prasanna VK, Raghavendra CS (1999) Efficient algorithms for block-cyclic array redistribution between processor sets. IEEE Trans Parallel Distrib Syst 10(12):1217–1240

  4. Prylli L, Tourancheau B (1997) Fast runtime block cyclic data redistribution on multiprocessors. J Parallel Distrib Comput 45:63–72

  5. Ramaswamy S, Banerjee P (1995) Automatic generation of efficient array redistribution routines for distributed memory multicomputers. In: Proc fifth symp frontiers of massively parallel computation, Feb 1995, pp 342–349

  6. Wang L, Stichnoth JM, Chatterjee S (1996) Runtime performance of parallel array assignment: an empirical study. In: Proc 1996 ACM/IEEE supercomputing conf. http://www.supercomp.org/sc96/proceedings

  7. Sundar NS, Jayasimha DN, Panda DK, Sadayappan P (2001) Hybrid algorithms for complete exchange in 2D meshes. IEEE Trans Parallel Distrib Syst 12(12):1201–1218

  8. Kalns ET, Ni LM (1995) Processor mapping techniques toward efficient data redistribution. IEEE Trans Parallel Distrib Syst 6(12):1234–1247

  9. Hsu C-H, Chung Y-C, Yang D-L, Dow C-R (2001) A generalized processor mapping technique for array redistribution. IEEE Trans Parallel Distrib Syst 12(7):743–757

  10. Huang J-W, Chu C-P (2006) An efficient communication scheduling method for the processor mapping technique applied data redistribution. J Supercomput 37:297–318

  11. Thakur R, Choudhary A, Ramanujam J (1996) Efficient algorithms for array redistribution. IEEE Trans Parallel Distrib Syst 7(6):587–594

  12. Walker DW, Otto SW (1996) Redistribution of block-cyclic data distributions using MPI. Concur Practice Exp 8(9):707–728

  13. Lim YW, Bhat PB, Prasanna VK (1998) Efficient algorithms for block cyclic redistribution of arrays. Algorithmica 24:298–330

  14. Desprez F, Dongarra J, Petitet A, Randriamaro C, Robert Y (1998) Scheduling block-cyclic array redistribution. IEEE Trans Parallel Distrib Syst 9(2):192–205

  15. Guo M, Nakata I (2001) A framework for efficient data redistribution on distributed memory multicomputers. J Supercomput 20:243–265

  16. Tseng Y-C, Gupta SKS (1996) All-to-all personalized communication in a wormhole-routed torus. IEEE Trans Parallel Distrib Syst 7(5):498–505

  17. Souravlas SI, Roumeliotis M (2004) A pipeline technique for dynamic data transfer on a multiprocessor grid. Int J Parallel Program 32(5):361–388

Author information

Corresponding author

Correspondence to Stavros Souravlas.

About this article

Cite this article

Souravlas, S., Roumeliotis, M. A message passing strategy for array redistributions in a torus network. J Supercomput 46, 40–57 (2008). https://doi.org/10.1007/s11227-008-0185-1

  • DOI: https://doi.org/10.1007/s11227-008-0185-1
