Abstract
Automatic parallelization in the polyhedral model is based on affine transformations from an original computation domain (iteration space) to a target space-time domain, often with a different transformation for each variable. Code generation is an often-ignored step in this process that has a significant impact on the quality of the final code. It involves a trade-off between code size and the simplification or optimization of control code. Previous code generation methods are based on loop splitting; however, they behave nonoptimally on parameterized programs. We present a general parameterized method for code generation based on the dual representation of polyhedra. Our algorithm uses a simple recursion on the dimensions of the domains and enables fine control over the trade-off between code size and control overhead.
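The trade-off between code size and control overhead can be illustrated with a small sketch. This is not the algorithm of the paper; it is a hypothetical one-dimensional example in which statement S1 runs on the domain 0 <= i <= n and statement S2 on m <= i <= p. The function names, the statement labels, and the parameter context 0 <= m <= n <= p are all assumptions made for illustration. The guarded version is compact but tests every guard at every iteration; the split version has no guards but more loops, and it is valid only under the assumed ordering of the parameters.

```python
def scan_guarded(n, m, p):
    """Compact code: one loop over the bounding interval, one guard per statement."""
    trace = []
    for i in range(min(0, m), max(n, p) + 1):
        if 0 <= i <= n:          # guard for S1, evaluated at every iteration
            trace.append(("S1", i))
        if m <= i <= p:          # guard for S2, evaluated at every iteration
            trace.append(("S2", i))
    return trace


def scan_split(n, m, p):
    """Larger code: guard-free loops over disjoint pieces of the union.

    Assumption: the parameters satisfy 0 <= m <= n <= p. Another ordering
    of the parameters would require a different decomposition.
    """
    assert 0 <= m <= n <= p
    trace = []
    for i in range(0, m):            # piece where only S1 is defined
        trace.append(("S1", i))
    for i in range(m, n + 1):        # piece where both statements are defined
        trace.append(("S1", i))
        trace.append(("S2", i))
    for i in range(n + 1, p + 1):    # piece where only S2 is defined
        trace.append(("S2", i))
    return trace


# Both versions visit the same statement instances in the same order.
same = scan_guarded(5, 2, 8) == scan_split(5, 2, 8)
```

The dependence of the split version on the ordering of n, m, and p hints at why parameterized programs make splitting-based approaches harder: the decomposition itself changes with the parameter values.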
Cite this article
Quilleré, F., Rajopadhye, S. & Wilde, D. Generation of Efficient Nested Loops from Polyhedra. International Journal of Parallel Programming 28, 469–498 (2000). https://doi.org/10.1023/A:1007554627716
DOI: https://doi.org/10.1023/A:1007554627716