An Efficient Communication Scheduling Method for the Processor Mapping Technique Applied Data Redistribution

Huang, Jih-Woei; Chu, Chih-Ping

doi:10.1007/s11227-006-6615-z

Published: September 2006

Volume 37, pages 297–318 (2006)

The Journal of Supercomputing

Abstract

Array redistribution is often required to execute a data-parallel program efficiently on distributed-memory multicomputers. When redistribution is performed in synchronous communication mode, data communications among the processors must be scheduled carefully to avoid unnecessary data transfer cost. Several efficient communication scheduling methods for Block-Cyclic redistribution have been proposed. Separately, the processor mapping technique can reduce the data transfer cost of redistribution. To preserve that reduction, optimal communication schedules must be constructed for redistributions to which the processor mapping technique has been applied. In this paper, we present a unified approach to constructing optimal communication schedules for Block-Cyclic redistribution with the processor mapping technique applied. The proposed method is founded on the processor mapping technique and constructs the required communication schedules more efficiently than other optimal scheduling methods.
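
To make the setting concrete, the following Python sketch illustrates the problem the abstract describes. It is not the authors' scheduling method, which the abstract does not detail. It assumes the standard definition of a Block-Cyclic distribution with block size x over P processors, under which global index i belongs to processor (i div x) mod P. The function names owner, comm_matrix, and greedy_schedule are hypothetical, chosen only for this illustration. The scheduler is a simple greedy packing that keeps each step contention-free (every processor sends at most one message and receives at most one message per step); it makes no claim to produce the optimal schedules the paper targets.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor that owns global index i under BLOCK-CYCLIC(block) on nprocs processors."""
    return (i // block) % nprocs


def comm_matrix(n, nprocs, src_block, dst_block):
    """Count the elements each source processor holds for each destination processor.

    Keys are (source, destination) pairs; entries with source == destination
    are local copies that need no inter-processor communication.
    """
    counts = defaultdict(int)
    for i in range(n):
        counts[(owner(i, src_block, nprocs), owner(i, dst_block, nprocs))] += 1
    return dict(counts)


def greedy_schedule(counts):
    """Pack remote messages into contention-free steps (illustrative greedy, not optimal).

    Within one step, no processor sends more than one message and no
    processor receives more than one message.
    """
    # Skip local copies and try the largest messages first.
    pending = sorted((pair for pair in counts if pair[0] != pair[1]),
                     key=lambda pair: -counts[pair])
    steps = []
    while pending:
        senders, receivers, step, deferred = set(), set(), [], []
        for s, d in pending:
            if s in senders or d in receivers:
                deferred.append((s, d))  # conflicts with this step, try later
            else:
                step.append((s, d))
                senders.add(s)
                receivers.add(d)
        steps.append(step)
        pending = deferred
    return steps


if __name__ == "__main__":
    counts = comm_matrix(n=60, nprocs=4, src_block=2, dst_block=3)
    for k, step in enumerate(greedy_schedule(counts)):
        print("step", k, [(s, d, counts[(s, d)]) for s, d in step])
```

In this sketch the entries with equal source and destination are the data that stay local. The processor mapping technique mentioned in the abstract reduces transfer cost by relabeling processors so that more data falls into those local entries; the remaining remote messages are what a communication schedule has to arrange.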

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, 701, ROC
Jih-Woei Huang & Chih-Ping Chu

Corresponding author

Correspondence to Jih-Woei Huang.

About this article

Cite this article

Huang, JW., Chu, CP. An Efficient Communication Scheduling Method for the Processor Mapping Technique Applied Data Redistribution. J Supercomput 37, 297–318 (2006). https://doi.org/10.1007/s11227-006-6615-z

Issue Date: September 2006
DOI: https://doi.org/10.1007/s11227-006-6615-z
