Parallel loop generation and scheduling

Lotfi, Shahriar; Parsa, Saeed

doi:10.1007/s11227-008-0262-5

Published: 27 February 2009

Volume 50, pages 289–306 (2009)

The Journal of Supercomputing

Abstract

Loop tiling is an efficient loop transformation, applied mainly to detect coarse-grained parallelism in loops. Applying n-dimensional non-rectangular tiles to generate parallel loops is a difficult task. This paper offers an efficient scheme for applying non-rectangular n-dimensional tiles to non-rectangular iteration spaces in order to generate parallel loops. To exploit wavefront parallelism efficiently, all tiles with an equal sum of coordinates are assumed to reside on the same wavefront. To assign the parallelepiped tiles on each wavefront to different processors, the paper also offers an improved block scheduling strategy.
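
The wavefront grouping described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a rectangular grid of tiles rather than the non-rectangular spaces the paper targets, and it uses plain contiguous block scheduling as a stand-in for the paper's improved strategy. The names `wavefronts`, `block_schedule`, `tile_counts`, and `num_procs` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def wavefronts(tile_counts):
    """Group tile coordinates by the sum of their coordinates.

    Assumption: the tile space is a rectangular grid with tile_counts[d]
    tiles along dimension d. Tiles sharing a coordinate sum form one wavefront.
    """
    fronts = defaultdict(list)
    for tile in product(*(range(n) for n in tile_counts)):
        fronts[sum(tile)].append(tile)
    # Wavefronts are returned in increasing order of coordinate sum.
    return [fronts[s] for s in sorted(fronts)]


def block_schedule(front, num_procs):
    """Split one wavefront into contiguous blocks, one block per processor.

    Plain block scheduling (an assumption, not the paper's improved variant):
    block sizes differ by at most one tile.
    """
    base, extra = divmod(len(front), num_procs)
    blocks, start = [], 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)
        blocks.append(front[start:start + size])
        start += size
    return blocks


# Schedule a 3 x 3 grid of tiles on 2 processors, one wavefront at a time.
schedule = [block_schedule(front, 2) for front in wavefronts((3, 3))]
```

In this sketch the wavefronts would be executed one after another, with the blocks inside a wavefront running concurrently; whether tiles on the same wavefront are actually independent depends on the loop's dependence vectors, which the sketch does not check.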

Author information

Authors and Affiliations

Computer Science Department, University of Tabriz, Tabriz, Iran
Shahriar Lotfi
Faculty of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Saeed Parsa

Corresponding author

Correspondence to Shahriar Lotfi.

About this article

Cite this article

Lotfi, S., Parsa, S. Parallel loop generation and scheduling. J Supercomput 50, 289–306 (2009). https://doi.org/10.1007/s11227-008-0262-5

Received: 12 August 2007
Accepted: 22 December 2008
Published: 27 February 2009
Issue Date: December 2009
DOI: https://doi.org/10.1007/s11227-008-0262-5
