A Framework for Efficient Data Redistribution on Distributed Memory Multicomputers

Guo, Minyi; Nakata, Ikuo

doi:10.1023/A:1011602732570


Published: November 2001

Volume 20, pages 243–265, (2001)

The Journal of Supercomputing

Minyi Guo¹ &
Ikuo Nakata²


Abstract

Array redistribution is often required in programs running on distributed memory parallel computers. Efficient redistribution algorithms are essential; otherwise program performance degrades considerably. The redistribution overhead consists of two parts: index computation and inter-processor communication. In this paper, using a notation for local data description called an LDD, we propose a framework that optimizes array redistribution in both index computation and inter-processor communication. We present an efficient index computation method and generate a communication schedule that minimizes the number of communication steps and eliminates node contention in each step. Experiments show the efficiency and flexibility of our techniques.
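To make the two overhead components concrete, the following minimal sketch may help. It is not the LDD-based framework of the paper, whose details are not given in this abstract. It assumes a one-dimensional array, a cyclic(block) distribution on the same number of processors before and after, a naive per-element owner computation, and a simple greedy scheduler that avoids node contention but is not guaranteed to use the minimum number of steps. All function names (`owner`, `message_sets`, `greedy_schedule`) are hypothetical.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor owning global index i under a cyclic(block) distribution."""
    return (i // block) % nprocs


def message_sets(n, src_block, dst_block, nprocs):
    """Index computation (naive): map (sender, receiver) -> indices to move."""
    msgs = defaultdict(list)
    for i in range(n):
        s = owner(i, src_block, nprocs)
        d = owner(i, dst_block, nprocs)
        if s != d:  # elements that stay on the same processor need no message
            msgs[(s, d)].append(i)
    return dict(msgs)


def greedy_schedule(pairs):
    """Group (sender, receiver) pairs into steps without node contention.

    Within one step each processor sends at most one message and receives
    at most one message. Greedy placement; minimality is NOT guaranteed.
    """
    steps = []
    for s, d in sorted(pairs):
        for step in steps:
            if all(s != s2 and d != d2 for s2, d2 in step):
                step.append((s, d))
                break
        else:
            steps.append([(s, d)])
    return steps


# Example (assumed sizes): 16 elements, cyclic(4) -> cyclic(1), 4 processors.
msgs = message_sets(n=16, src_block=4, dst_block=1, nprocs=4)
steps = greedy_schedule(msgs)
for k, step in enumerate(steps):
    print(f"step {k}: {step}")
```

The per-element scan in `message_sets` costs time proportional to the array size, which is the kind of index computation overhead an efficient method would aim to reduce. The schedule shows what eliminating node contention means: no processor appears twice as a sender or twice as a receiver within a step.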



Author information

Authors and Affiliations

Department of Computer Software, The University of Aizu, Aizu-Wakamatsu City, Fukushima, Japan
Minyi Guo
Faculty of Computer and Information Sciences, Hosei University, Tokyo, Japan
Ikuo Nakata


About this article

Cite this article

Guo, M., Nakata, I. A Framework for Efficient Data Redistribution on Distributed Memory Multicomputers. The Journal of Supercomputing 20, 243–265 (2001). https://doi.org/10.1023/A:1011602732570


Issue Date: November 2001
DOI: https://doi.org/10.1023/A:1011602732570
