
A new algorithm for global optimization for parallelism and locality

  • Postlinear Loop Transformations
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 892)

Abstract

Converting sequential programs to execute on parallel computers is difficult because of the need to globally optimize for both parallelism and data locality. The choice of which loop nests to parallelize, and how, drastically affects data locality. Similarly, data distribution directives, such as DISTRIBUTE in High Performance Fortran (HPF), affect available parallelism and locality. What is needed is a systematic approach to converting programs to parallel form, based upon analysis that identifies opportunities for both parallelism and locality in one representation.
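To make the trade-off concrete, the following is a minimal C sketch (not taken from the paper, which works in terms of Fortran and HPF; the OpenMP-style pragmas are used here only as shorthand for "run this loop in parallel"). Parallelizing the outer loop of the nest gives each processor whole rows of the row-major arrays, which corresponds roughly to an HPF DISTRIBUTE(BLOCK, *) of the data; parallelizing the inner loop instead gives each processor strips of columns, corresponding roughly to DISTRIBUTE(*, BLOCK). The computation is identical, but the implied data distribution, and hence the locality, is not.

    #include <stdio.h>

    #define N 1024
    static double a[N][N], b[N][N];

    /* Outer loop parallel: each processor handles a block of whole rows of
     * a and b, so the implied distribution is by rows
     * (roughly DISTRIBUTE a(BLOCK, *) in HPF terms). */
    void update_by_rows(double s)
    {
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = s * b[i][j];
    }

    /* Inner loop parallel: each processor handles a strip of columns of
     * every row, so the implied distribution is by columns
     * (roughly DISTRIBUTE a(*, BLOCK)).  Same computation, different
     * locality on a machine where data ends up near the processor that
     * touches it. */
    void update_by_columns(double s)
    {
        for (int i = 0; i < N; i++) {
            #pragma omp parallel for
            for (int j = 0; j < N; j++)
                a[i][j] = s * b[i][j];
        }
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                b[i][j] = i + j;
        update_by_rows(2.0);
        update_by_columns(0.5);
        printf("a[1][1] = %f\n", a[1][1]);
        return 0;
    }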

This paper presents a global framework for optimizing parallelism and locality, based upon constraint solving for locality between potentially parallel loop nests. We outline the theory behind the framework and provide a global algorithm for parallelizing programs while optimizing for locality. We also present and analyze results from applying the algorithm to parallelize the Perfect benchmarks, targeted at the KSR-1. Unlike other approaches, we do not assume an explicit distribution of data to processors; the distribution is inferred from locality constraints and available parallelism. This approach works well for machines such as the KSR-1, where there is no explicit distribution of data. However, our approach could also be used to generate code with explicit data distribution for distributed-memory machines, for example by generating HPF.
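As an illustration of the kind of inter-nest locality constraint the framework solves for, here is a hedged C sketch of two loop nests that share an array (again an invented example, not code from the paper; the OpenMP pragmas stand in for "this loop is parallel"). If the first nest is parallelized over rows and the second over columns, the data each processor owns changes between nests and locality is lost; interchanging the second nest's loops, which is legal because its iterations are independent, lets both nests run parallel over rows. The by-row distribution then falls out of that consistent choice rather than being declared with a DISTRIBUTE directive, which is the spirit of the inference described above.

    #include <stdio.h>

    #define N 512
    static double a[N][N], b[N][N];

    /* Nest 1: parallel over i.  Each processor writes whole rows of a,
     * so the inferred distribution of a is by rows. */
    void nest1(void)
    {
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = (double)(i + j);
    }

    /* Nest 2 as written: parallelizing the outer j loop makes each
     * processor read a strip of columns of a, i.e. rows owned by every
     * other processor under nest 1's row distribution.  The locality
     * constraint between the two nests is violated. */
    void nest2_conflicting(void)
    {
        #pragma omp parallel for
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                b[i][j] = 2.0 * a[i][j];
    }

    /* Nest 2 after loop interchange: parallel over i, so each processor
     * reads only the rows of a it produced in nest 1 and writes the
     * matching rows of b.  Both nests now agree on a by-row distribution,
     * inferred from the choice of parallel loops rather than declared. */
    void nest2_consistent(void)
    {
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                b[i][j] = 2.0 * a[i][j];
    }

    int main(void)
    {
        nest1();
        nest2_consistent();
        printf("b[1][2] = %f\n", b[1][2]);
        return 0;
    }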




Editor information

Keshav Pingali, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua


Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Appelbe, B., Doddapaneni, S., Hardnett, C. (1995). A new algorithm for global optimization for parallelism and locality. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025875


  • DOI: https://doi.org/10.1007/BFb0025875


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58868-9

  • Online ISBN: 978-3-540-49134-7

