Regular Article
The ADDAP System on the iPSC/860: Automatic Data Distribution and Parallelization

https://doi.org/10.1006/jpdc.1996.0001

Abstract

This paper describes the ADDAP system, a parallelizing compiler for distributed-memory MIMD machines. The compiler automatically computes a data distribution for the arrays of the source program with a branch-and-bound algorithm, and it parallelizes the inner loops of the program by inserting the communication statements needed to access nonlocal array sections. The branch-and-bound algorithm incrementally constructs paths in a decision tree in which each node on a path corresponds to the distribution of one array of the source program. For each path, a communication analysis tool computes the corresponding communication costs. Based on these costs, the data distribution algorithm tries to find the best data distribution by searching for the cheapest path from a leaf to the root of the decision tree. Because expensive paths are rejected as early as possible, the algorithm actually builds only a few paths, corresponding to a small fraction of the decision tree. The runtime of the data distribution phase therefore remains small even for larger input programs. The structure of the algorithm also makes it easy to allow redistributions during program execution. The communication analysis tool computes the communication costs of a data distribution by determining the number and size of the messages that each processor has to receive during program execution. The tool also takes into account sequentializations caused by data dependences. A prototype implementation of the system generates code for an Intel iPSC/860. Tests show that the communication costs are determined quite accurately and that the computed array distributions cause only a small communication overhead compared to other data distributions. This results in good speedup values for most of the parallelized programs.
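
The search described in the abstract can be illustrated with a minimal sketch. It is not the ADDAP implementation: the array names, the two candidate distributions ("row" and "col"), and the cost tables below are hypothetical, and the toy cost function merely stands in for the communication analysis tool. The sketch assumes that the cost of a partial path never decreases when another array is added, which is what makes early rejection safe.

```python
from typing import Callable, Dict, List, Optional, Tuple

Assignment = Dict[str, str]  # array name -> chosen distribution

# Hypothetical toy cost model (not taken from the paper).
# Message volume incurred by an array on its own under a given distribution.
INTRINSIC = {("A", "col"): 50, ("B", "row"): 20, ("C", "row"): 5}
# Message volume incurred when two arrays used together are distributed differently.
COUPLINGS = [("A", "B", 100), ("B", "C", 40), ("A", "C", 10)]


def toy_cost(chosen: Assignment) -> int:
    """Cost of a (possibly partial) path; never decreases as arrays are added."""
    cost = sum(c for (arr, dist), c in INTRINSIC.items() if chosen.get(arr) == dist)
    for a, b, volume in COUPLINGS:
        if a in chosen and b in chosen and chosen[a] != chosen[b]:
            cost += volume
    return cost


def branch_and_bound(
    arrays: List[str],
    candidates: List[str],
    partial_cost: Callable[[Assignment], int],
) -> Tuple[Optional[Assignment], float, int]:
    """Return the cheapest complete assignment, its cost, and the nodes visited."""
    best: Optional[Assignment] = None
    best_cost = float("inf")
    explored = 0

    def extend(depth: int, chosen: Assignment) -> None:
        nonlocal best, best_cost, explored
        explored += 1
        cost = partial_cost(chosen)
        if cost >= best_cost:
            return  # reject this path early: it cannot beat the best leaf so far
        if depth == len(arrays):
            best, best_cost = dict(chosen), cost  # new cheapest complete path
            return
        for dist in candidates:
            chosen[arrays[depth]] = dist  # one tree level per array
            extend(depth + 1, chosen)
            del chosen[arrays[depth]]

    extend(0, {})
    return best, best_cost, explored


best, cost, explored = branch_and_bound(["A", "B", "C"], ["row", "col"], toy_cost)
print(best, cost, explored)
```

For three arrays and two candidate distributions the full decision tree has 15 nodes; with pruning, the sketch visits fewer than that while still returning the cheapest leaf of this toy model.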
