Abstract
Loop tiling is an efficient loop transformation, applied mainly to expose coarse-grained parallelism in loops. Generating parallel loops from n-dimensional non-rectangular tiles is difficult. This paper presents an efficient scheme for applying non-rectangular n-dimensional tiles to non-rectangular iteration spaces in order to generate parallel loops. To exploit wavefront parallelism efficiently, all tiles whose coordinates have the same sum are assumed to lie on the same wavefront. To assign the parallelepiped tiles on each wavefront to different processors, the paper also proposes an improved block scheduling strategy.
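The wavefront rule stated in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a rectangular grid of tiles rather than the non-rectangular spaces the paper targets, it assumes tiles on the same wavefront are independent of one another, and it uses plain block scheduling because the abstract does not describe the improved strategy. The names `wavefronts`, `block_schedule`, `tile_counts`, and `num_procs` are hypothetical.

```python
from collections import defaultdict
from itertools import product


def wavefronts(tile_counts):
    """Group tile coordinates by coordinate sum; each group is one wavefront.

    Assumption: tiles form a rectangular grid with tile_counts[d] tiles
    along dimension d (a simplification of the paper's setting).
    """
    fronts = defaultdict(list)
    for tile in product(*(range(n) for n in tile_counts)):
        fronts[sum(tile)].append(tile)
    # Wavefronts are returned in increasing order of coordinate sum.
    return [fronts[s] for s in sorted(fronts)]


def block_schedule(tiles, num_procs):
    """Split one wavefront into contiguous blocks, one block per processor.

    Assumption: plain block scheduling, not the paper's improved variant.
    The first (len(tiles) % num_procs) processors receive one extra tile.
    """
    base, extra = divmod(len(tiles), num_procs)
    blocks, start = [], 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)
        blocks.append(tiles[start:start + size])
        start += size
    return blocks


if __name__ == "__main__":
    # Wavefronts run one after another; blocks within a wavefront
    # could run on different processors at the same time.
    for step, front in enumerate(wavefronts((3, 3))):
        print(step, block_schedule(front, 2))
```

In this sketch a 3 × 3 grid of tiles yields five wavefronts, with coordinate sums 0 through 4. The middle wavefront holds three tiles, so with two processors one of them receives two tiles and the other one.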
Cite this article
Lotfi, S., Parsa, S. Parallel loop generation and scheduling. J Supercomput 50, 289–306 (2009). https://doi.org/10.1007/s11227-008-0262-5
DOI: https://doi.org/10.1007/s11227-008-0262-5