
Partitioning and mapping of nested loops for linear array multicomputers

  • Published in: The Journal of Supercomputing

Abstract

In distributed-memory multicomputers, minimizing interprocessor communication is the key to the efficient execution of parallel programs. To reduce communication overhead, parallel programs on multicomputers must be carefully scheduled by parallelizing compilers. This paper proposes compilation techniques for partitioning and mapping nested loops with constant data dependences onto linear array multicomputers. First, a systematic partition strategy is proposed to project an n-dimensional computational structure, representing an n-nested loop, onto a line to form a one-dimensional projected structure with low communication overhead. Then, a mapping algorithm is proposed for mapping the partitioned loops onto linear arrays in a way that balances the workload and minimizes the communication cost among processors. Finally, parallel execution code can be automatically generated for such linear array multicomputers.
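
The projection-and-mapping idea summarized above can be pictured with a small, hypothetical sketch; it is not the authors' published algorithm, whose details are not reproduced here. The sketch assumes a projection-vector formulation: each iteration point of an n-nested loop is collapsed onto a line by a dot product with an integer vector, and the resulting one-dimensional indices are block-assigned to the processors of a linear array so that the workload stays roughly balanced. The function names, the greedy balancing rule, and the example projection vector (1, 1) are all assumptions chosen for illustration.

```python
# Hypothetical sketch (not the paper's algorithm): project an n-D iteration
# space onto a line with an integer vector, then block-map the projected
# indices onto the processors of a linear array.

import itertools

def project_iterations(loop_bounds, proj):
    """Map each n-D iteration point to a 1-D index via a dot product.

    loop_bounds : list of (lower, upper) inclusive bounds, one per loop level
    proj        : integer projection vector of the same length
    Returns a dict mapping projected index -> list of iteration points.
    """
    ranges = [range(lo, hi + 1) for lo, hi in loop_bounds]
    groups = {}
    for point in itertools.product(*ranges):
        key = sum(p * i for p, i in zip(proj, point))
        groups.setdefault(key, []).append(point)
    return groups

def map_to_processors(groups, num_procs):
    """Greedily assign contiguous blocks of projected indices to processors
    so each processor receives roughly the same number of iteration points."""
    keys = sorted(groups)
    total = sum(len(groups[k]) for k in keys)
    mapping, proc, load = {}, 0, 0
    for k in keys:
        mapping[k] = proc
        load += len(groups[k])
        # Advance to the next processor once its fair share is reached.
        if proc < num_procs - 1 and load >= total * (proc + 1) / num_procs:
            proc += 1
    return mapping

if __name__ == "__main__":
    # Two nested loops of 8 iterations each, projected with vector (1, 1):
    # iterations on the same anti-diagonal share a projected index.
    groups = project_iterations([(1, 8), (1, 8)], (1, 1))
    mapping = map_to_processors(groups, num_procs=4)
    for k in sorted(groups):
        print(f"line index {k:2d} -> P{mapping[k]} ({len(groups[k])} iterations)")
```

Under this sketch, choosing the projection so that the constant dependence vectors map to small projected distances keeps most communication between neighboring processors of the array, which is the low-communication property the abstract targets.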

About this article

Cite this article

Sheu, J.P., Chen, T.S. Partitioning and mapping of nested loops for linear array multicomputers. J Supercomput 9, 183–202 (1995). https://doi.org/10.1007/BF01245404
