An Efficient Communication Scheduling Method for the Processor Mapping Technique Applied Data Redistribution

The Journal of Supercomputing

Abstract

Array redistribution is often required to execute a data-parallel program efficiently on distributed-memory multicomputers. When array redistribution is performed in synchronous communication mode, the data communications among the processors must be scheduled carefully to avoid unnecessary data transfer cost. Several efficient communication scheduling methods for Block-Cyclic redistribution have been proposed. Separately, the processor mapping technique can help reduce the data transfer cost of redistribution. To preserve that reduction, optimal communication schedules must be constructed for redistributions to which the processor mapping technique is applied. In this paper, we present a unified approach to constructing optimal communication schedules for Block-Cyclic redistribution with the processor mapping technique applied. The proposed method is founded on the processor mapping technique and constructs the required communication schedules more efficiently than other optimal scheduling methods.
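To make the scheduling problem concrete, the following Python sketch may help. It is not the method proposed in the paper and does not use processor mapping. It assumes a one-dimensional array of `n` elements redistributed from cyclic(`x`) to cyclic(`y`) over the same `nprocs` processors, and it assumes that in one synchronous step each processor sends at most one message and receives at most one message. The function names (`owner`, `message_sizes`, `greedy_schedule`) are hypothetical, and the greedy packing shown is a simple heuristic that is not guaranteed to produce an optimal schedule.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor that owns global index i under a cyclic(block) distribution."""
    return (i // block) % nprocs


def message_sizes(n, x, y, nprocs):
    """Count the elements each source must send to each destination
    when an n-element array goes from cyclic(x) to cyclic(y)."""
    sizes = defaultdict(int)
    for i in range(n):
        sizes[(owner(i, x, nprocs), owner(i, y, nprocs))] += 1
    return dict(sizes)


def greedy_schedule(sizes):
    """Pack messages into steps so that, within a step, no processor
    sends twice and no processor receives twice (assumed contention model).
    Larger messages are placed first. This is a heuristic, not an optimal
    schedule construction."""
    steps = []
    ordered = sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for (src, dst), size in ordered:
        if src == dst:
            continue  # data that stays on the same processor needs no message
        for step in steps:
            if all(src != s and dst != d for (s, d, _) in step):
                step.append((src, dst, size))
                break
        else:
            steps.append([(src, dst, size)])
    return steps


if __name__ == "__main__":
    sizes = message_sizes(n=48, x=2, y=3, nprocs=4)
    for k, step in enumerate(greedy_schedule(sizes)):
        print("step", k, step)
```

Under this model the cost of a step is driven by its largest message, which is why a schedule that groups messages of similar size and uses as few steps as possible keeps the total transfer cost low.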

Author information

Corresponding author

Correspondence to Jih-Woei Huang.

About this article

Cite this article

Huang, JW., Chu, CP. An Efficient Communication Scheduling Method for the Processor Mapping Technique Applied Data Redistribution. J Supercomput 37, 297–318 (2006). https://doi.org/10.1007/s11227-006-6615-z

  • DOI: https://doi.org/10.1007/s11227-006-6615-z
