Loop Transformations for Hierarchical Parallelism and Locality

  • Conference paper
Languages, Compilers, and Run-Time Systems for Scalable Computers (LCR 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1511)

Abstract

The increasing depth of memory and parallelism hierarchies in future scalable computer systems poses many challenges to parallelizing compilers. In this paper, we address the problem of selecting and implementing iteration-reordering loop transformations for hierarchical parallelism and locality. We present a two-pass algorithm for selecting sequences of Block, Unimodular, Parallel, and Coalesce transformations for optimizing locality and parallelism for a specified parallelism hierarchy model. These general transformation sequences are implemented using a framework for iteration-reordering loop transformations that we developed in past work [15].
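The four transformation kinds named in the abstract can be illustrated on a small two-deep loop nest. The sketch below is not the paper's selection algorithm or its framework from [15]; it only shows what each reordering does to the visit order of an `n` by `m` iteration space. All function names and the tile sizes `bi` and `bj` are hypothetical, and the code assumes the loop body carries no dependences, so any reordering is legal.

```python
def original(n, m):
    """Visit order of the untransformed nest: for i, then for j."""
    return [(i, j) for i in range(n) for j in range(m)]


def interchanged(n, m):
    """Loop interchange, a unimodular transformation (matrix [[0, 1], [1, 0]])."""
    return [(i, j) for j in range(m) for i in range(n)]


def blocked(n, m, bi, bj):
    """Block (tile) both loops: outer loops pick a tile, inner loops walk inside it."""
    order = []
    for ii in range(0, n, bi):
        for jj in range(0, m, bj):
            for i in range(ii, min(ii + bi, n)):
                for j in range(jj, min(jj + bj, m)):
                    order.append((i, j))
    return order


def blocked_coalesced(n, m, bi, bj):
    """Block, then coalesce the two tile loops into a single loop over tiles.

    Each value of t is an independent unit of work, so this single loop is
    the one a compiler could mark parallel (assumption: no dependences).
    """
    tiles_i = -(-n // bi)  # ceiling division
    tiles_j = -(-m // bj)
    order = []
    for t in range(tiles_i * tiles_j):
        ti, tj = divmod(t, tiles_j)  # recover the 2-D tile index from t
        for i in range(ti * bi, min((ti + 1) * bi, n)):
            for j in range(tj * bj, min((tj + 1) * bj, m)):
                order.append((i, j))
    return order


if __name__ == "__main__":
    print(blocked(4, 4, 2, 2)[:4])
```

Every variant visits exactly the same set of iterations; only the order changes, which is what makes these transformations iteration-reordering. A hierarchical scheme would apply blocking once per level of the memory or parallelism hierarchy, with tile sizes chosen per level.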

References

  1. Utpal Banerjee. Dependence Analysis for Supercomputing. Kluwer Academic Publishers, Norwell, Massachusetts, 1988.

  2. Utpal Banerjee. Unimodular Transformations of Double Loops. Proceedings of the Third Workshop on Languages and Compilers for Parallel Computing, August 1990.

  3. Jyh-Herng Chow, Leonard E. Lyon, and Vivek Sarkar. Automatic Parallelization for Symmetric Shared-Memory Multiprocessors. CASCON '96 conference, November 1996.

  4. Jeanne Ferrante, Vivek Sarkar, and Wendy Thrash. On Estimating and Enhancing Cache Effectiveness. Lecture Notes in Computer Science, (589):328–343, 1991. Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing, Santa Clara, California, USA, August 1991. Edited by U. Banerjee, D. Gelernter, A. Nicolau, D. Padua.

  5. Francois Irigoin. Code generation for the hyperplane method and for loop interchange. Technical report, Ecole Nationale Superieure des Mines de Paris, October 1988. Report ENSMP-CAI-88-E102/CAI/I.

  6. Francois Irigoin and Remi Triolet. Supernode Partitioning. Conference Record of Fifteenth ACM Symposium on Principles of Programming Languages, 1988.

  7. Induprakas Kodukula, Nawaaz Ahmed, and Keshav Pingali. Data-centric Multilevel Blocking. Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation, Las Vegas, Nevada, pages 346–357, June 1997.

  8. L. Lamport. The Parallel Execution of DO Loops. Communications of the ACM, 17(2):83–93, February 1974.

  9. Kathryn S. McKinley, Steve Carr, and Chau-Wen Tseng. Improving Data Locality with Loop Transformations. ACM Transactions on Programming Languages and Systems, 18:423–453, July 1996.

  10. Nicholas Mitchell, Larry Carter, Jeanne Ferrante, and Karin Hogstedt. Quantifying the Multi-Level Nature of Tiling Interactions. In Languages and Compilers for Parallel Computing: Proceedings of the 10th International Workshop, Minneapolis, MN, August 1997. Lecture Notes in Computer Science. Springer-Verlag, New York, 1998. (To appear.)

  11. Constantine Polychronopoulos. Loop Coalescing: A Compiler Transformation for Parallel Machines. Technical report, U. of Ill., January 1987. Submitted for publication to the 1987 International Conference on Parallel Processing, St. Charles, Ill.

  12. Constantine D. Polychronopoulos and David J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers, C-36(12), December 1987.

  13. Vivek Sarkar. Automatic Selection of High Order Transformations in the IBM XL Fortran Compilers. IBM Journal of Research and Development, 41(3), May 1997.

  14. Vivek Sarkar, Guang R. Gao, and Shaohua Han. Locality Analysis for Distributed Shared-Memory Multiprocessors. In Languages and Compilers for Parallel Computing: Proceedings of the 9th International Workshop, Santa Clara, CA, August 1996. Lecture Notes in Computer Science. Springer-Verlag, New York, 1997.

  15. Vivek Sarkar and Radhika Thekkath. A General Framework for Iteration-Reordering Loop Transformations. Proceedings of the ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, pages 175–187, June 1992.

  16. Michael E. Wolf and Monica S. Lam. A Data Locality Optimization Algorithm. Proceedings of the ACM SIGPLAN Symposium on Programming Language Design and Implementation, pages 30–44, June 1991.

  17. Michael E. Wolf and Monica S. Lam. A Loop Transformation Theory and an Algorithm to Maximize Parallelism. IEEE Transactions on Parallel and Distributed Systems, 2(4):452–471, October 1991.

  18. Michael J. Wolfe. Optimizing Supercompilers for Supercomputers. Pitman, London and The MIT Press, Cambridge, Massachusetts, 1989. In the series, Research Monographs in Parallel and Distributed Computing.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sarkar, V. (1998). Loop Transformations for Hierarchical Parallelism and Locality. In: O’Hallaron, D.R. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 1998. Lecture Notes in Computer Science, vol 1511. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49530-4_5

  • DOI: https://doi.org/10.1007/3-540-49530-4_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65172-7

  • Online ISBN: 978-3-540-49530-7

  • eBook Packages: Springer Book Archive
