Advanced loop optimizations for parallel computers

  • Session 4A: Compilers and Restructuring Techniques I
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 297)

Abstract

So far, most work on program dependence analysis has concentrated on compile-time techniques, which are not always accurate and are often conservative. By coupling the compiler's ability to perform elaborate program optimizations with the run-time system's more accurate knowledge of certain program characteristics, we can uncover and exploit even more parallelism in ordinary programs. By performing run-time dependence checking, certain types of loops that were previously treated as serial can be executed concurrently. This paper presents a run-time dependence checking scheme and a new compiler optimization aimed at parallelizing serial loops. In particular, we present cycle shrinking, a compiler transformation that "shrinks" the dependence distances in serial loops, allowing parts of such loops to execute concurrently. Code reordering for minimizing communication and subscript blocking are also discussed briefly.
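
To make cycle shrinking concrete, the sketch below (in C, since the paper's text is not reproduced here) blocks a loop in which every loop-carried dependence has distance k into serial groups of k iterations; within a group no iteration depends on another, so each group may execute concurrently. The array names, the distance k, and the OpenMP pragma are illustrative assumptions, not the paper's own formulation.

    #include <stddef.h>

    /* Hypothetical serial loop: every loop-carried dependence has
       distance k >= 1, i.e. a value written in iteration i is first
       read in iteration i + k. Arrays a and b hold n + k elements. */
    void serial_loop(double *a, double *b, const double *c,
                     size_t n, size_t k)
    {
        for (size_t i = 0; i < n; i++) {
            a[i + k] = b[i] + c[i];   /* read again k iterations later */
            b[i + k] = a[i] * 2.0;    /* read again k iterations later */
        }
    }

    /* After cycle shrinking: the outer loop steps serially over groups
       of k iterations; within one group no iteration depends on another,
       so the inner loop may run concurrently (expressed here with an
       OpenMP pragma; compile with -fopenmp or equivalent). */
    void shrunk_loop(double *a, double *b, const double *c,
                     size_t n, size_t k)
    {
        for (size_t j = 0; j < n; j += k) {        /* serial over groups */
            size_t hi = (j + k < n) ? j + k : n;   /* clip the last group */
            #pragma omp parallel for
            for (size_t i = j; i < hi; i++) {      /* independent iterations */
                a[i + k] = b[i] + c[i];
                b[i + k] = a[i] * 2.0;
            }
        }
    }

Since each group of k iterations executes concurrently, the transformed loop can approach a factor-of-k speedup over the fully serial original.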

This work was supported in part by the National Science Foundation under Grants No. NSF DCR84-10110 and NSF DCR84-06916, the U.S. Department of Energy under Grant No. DOE DE-FG02-85ER25001, and a donation from IBM.




Editor information

E. N. Houstis, T. S. Papatheodorou, C. D. Polychronopoulos


Copyright information

© 1988 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Polychronopoulos, C.D. (1988). Advanced loop optimizations for parallel computers. In: Houstis, E.N., Papatheodorou, T.S., Polychronopoulos, C.D. (eds) Supercomputing. ICS 1987. Lecture Notes in Computer Science, vol 297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-18991-2_15

  • DOI: https://doi.org/10.1007/3-540-18991-2_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-18991-6

  • Online ISBN: 978-3-540-38888-3
