Communication-free parallelization via affine transformations

  • Postlinear Loop Transformations
  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1994)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 892)

Abstract

The paper describes a parallelization algorithm for programs consisting of arbitrary nestings of loops and sequences of loops. The code produced by our algorithm yields all the degrees of communication-free parallelism that can be obtained via loop fission, fusion, interchange, reversal, skewing, scaling, reindexing and statement reordering. The algorithm first assigns the iterations of instructions in the program to processors via affine processor mappings, then generates the correct code by ensuring that the code executed by each processor is a subsequence of the original sequential execution sequence.
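
To make the idea of an affine processor mapping concrete, here is a minimal sketch. It is not taken from the paper: the two-statement loop nest, the array names `X` and `Y`, the problem size `N`, and the mappings `p1` and `p2` are all assumptions chosen for illustration. The sketch checks that every array element is touched by one processor only, then runs each processor's share as a subsequence of the original order and compares the result with plain sequential execution.

```python
from collections import defaultdict

N = 4  # illustrative problem size (assumption)

# Hypothetical loop nest, for i, j in 1..N:
#   S1: X[i][j] = X[i][j] + Y[i-1][j]
#   S2: Y[i][j] = Y[i][j] + X[i][j-1]

def p1(i, j):
    """Assumed affine processor mapping for statement S1."""
    return i - j

def p2(i, j):
    """Assumed affine processor mapping for statement S2."""
    return i - j + 1

def iterations():
    """Yield statement instances in the original sequential order."""
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            yield ("S1", i, j)
            yield ("S2", i, j)

def touched(inst):
    """Array elements read or written by one statement instance."""
    stmt, i, j = inst
    if stmt == "S1":
        return [("X", i, j), ("Y", i - 1, j)]
    return [("Y", i, j), ("X", i, j - 1)]

def communication_free(map1, map2):
    """True if no array element is touched by two different processors."""
    owner = {}
    for inst in iterations():
        stmt, i, j = inst
        p = map1(i, j) if stmt == "S1" else map2(i, j)
        for elem in touched(inst):
            if owner.setdefault(elem, p) != p:
                return False
    return True

def fresh_arrays():
    """Arbitrary initial data, including the boundary row and column 0."""
    X = [[10 * i + j for j in range(N + 1)] for i in range(N + 1)]
    Y = [[i + 2 * j for j in range(N + 1)] for i in range(N + 1)]
    return X, Y

def run(instances, X, Y):
    """Execute the given statement instances in the order supplied."""
    for stmt, i, j in instances:
        if stmt == "S1":
            X[i][j] = X[i][j] + Y[i - 1][j]
        else:
            Y[i][j] = Y[i][j] + X[i][j - 1]

def run_sequential():
    X, Y = fresh_arrays()
    run(iterations(), X, Y)
    return X, Y

def run_partitioned(map1, map2):
    """Each processor runs its own subsequence of the sequential order."""
    per_proc = defaultdict(list)
    for inst in iterations():
        stmt, i, j = inst
        p = map1(i, j) if stmt == "S1" else map2(i, j)
        per_proc[p].append(inst)
    X, Y = fresh_arrays()
    # Processors are visited in an arbitrary order to show that they are
    # independent of one another.
    for p in sorted(per_proc, reverse=True):
        run(per_proc[p], X, Y)
    return X, Y
```

In this sketch, `communication_free(p1, p2)` holds because both statements that touch a given element of `X` or `Y` are sent to the same processor, so `run_partitioned(p1, p2)` matches `run_sequential()`. A mapping such as `p = i` for both statements fails the check, since `S1` at `(i, j)` and `S2` at `(i-1, j)` both touch `Y[i-1][j]`.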

This research was supported in part by DARPA contract DABT63-91-K-0003 and an NSF Young Investigator award.

Editor information

Keshav Pingali, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lim, A.W., Lam, M.S. (1995). Communication-free parallelization via affine transformations. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025873

  • DOI: https://doi.org/10.1007/BFb0025873

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58868-9

  • Online ISBN: 978-3-540-49134-7

  • eBook Packages: Springer Book Archive
