Regular Article
Effects of Loop Fusion and Statement Migration on the Speedup of Vector Multiprocessors

https://doi.org/10.1006/jpdc.1995.1144

Abstract

Vector multiprocessors rely on both spatial and temporal parallelism to achieve significant speedup. For singly nested loops, we study the effect on the speedup of (1) loop fusion and (2) increasing the granule size of parallel-vector loops using statements extracted from scalar loops. The proposed optimizations migrate vector statements from one loop to another, create new loops, and reduce others. Loops and statements that belong to strongly connected data paths are vertically fused, whenever possible, to promote chaining and the reuse of registers and cache. To reduce loop synchronization, horizontal fusion is also used for independent loops having compatible dependence types. Finally, vector operations are scheduled based on knowledge of the timing of arithmetic pipelines and of load and store operations, and on management of the available resources. Testing is carried out using synthetic Fortran programs on the Convex C240 vector multiprocessor. The proposed loop fusion improves the speedup by 18 to 43% over the C240 commercial optimizing compiler. Chaining-oriented scheduling and allocation yields a 9 to 15% improvement over the highest optimization option of the C240 compiler.
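
As a rough illustration of what fusing a producer loop with its consumer means, the following minimal sketch contrasts two separate loops with a single fused loop. It is not taken from the article: the function names `unfused` and `fused`, the arrays `a`, `b`, `c`, and the arithmetic are all hypothetical, and Python is used only for readability, whereas the experiments reported above use Fortran on the Convex C240. The sketch assumes the second loop reads `b[i]` at the same index at which the first loop wrote it, which is what makes merging the two loop bodies safe here.

```python
def unfused(a):
    """Two separate loops: the second reads what the first wrote."""
    n = len(a)
    b = [0.0] * n
    c = [0.0] * n
    for i in range(n):          # loop 1 produces b
        b[i] = 2.0 * a[i] + 1.0
    for i in range(n):          # loop 2 consumes b at the same index i
        c[i] = b[i] + a[i]
    return b, c


def fused(a):
    """One loop: each b[i] is consumed right after it is produced."""
    n = len(a)
    b = [0.0] * n
    c = [0.0] * n
    for i in range(n):
        t = 2.0 * a[i] + 1.0    # local value, standing in for a register (assumption)
        b[i] = t
        c[i] = t + a[i]         # reuses t and a[i] without a second pass over memory
    return b, c


if __name__ == "__main__":
    data = [float(i) for i in range(8)]
    # Both versions must compute identical results for fusion to be legal.
    assert unfused(data) == fused(data)
    print(fused(data)[1])
```

In the fused version the intermediate value is consumed immediately after it is produced, which is the kind of locality that the abstract associates with chaining and with register and cache reuse. Whether such a transformation pays off on a real vector machine depends on the dependence analysis and pipeline timing that the article studies.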
