
Interprocedural communication optimizations for distributed memory compilation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 892))

Abstract

Managing communication is a difficult problem in distributed memory compilation. When the exact data to be communicated cannot be determined at compile time, communication can still be optimized by runtime routines that generate a schedule for the communication. This leads to two optimization problems: placing communication so that data, once communicated, can be reused where possible, and placing schedule calls so that the result of runtime preprocessing can be reused for as many communications as possible. In large application codes, computation and communication are spread across multiple subroutines, so acceptable performance cannot be achieved without performing these optimizations across subroutine boundaries. In this paper, we present an interprocedural analysis framework for these two optimization problems. Our optimizations are based on a program abstraction we call the Control & Call Flow Graph, which extends the call graph abstraction by recording the control flow relations among the call sites within each subroutine. We show how both the communication placement and the schedule call placement problems can be solved by data-flow analysis on the Control & Call Flow Graph structure.
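The runtime preprocessing the abstract describes follows the inspector/executor pattern: an "inspector" examines the (compile-time-unknown) access pattern and builds a communication schedule, and an "executor" carries out the messages it describes; reusing a cached schedule is what makes schedule call placement pay off. The following is a minimal single-process sketch of that idea, not the paper's actual runtime system (the PARTI primitives); all names (`get_schedule`, `execute_gather`, `owner_of`, `fetch`) are hypothetical stand-ins.

```python
# Hypothetical sketch of inspector/executor schedule reuse.
# The cache plays the role of a well-placed schedule call: the expensive
# inspector phase runs once per distinct access pattern, not once per
# communication.
_schedule_cache = {}

def get_schedule(indices, owner_of):
    """Inspector: group indirect accesses by the processor that owns them.

    `indices` are global element indices to be fetched; `owner_of(g)` maps
    a global index to its owning processor. A cached schedule is returned
    when the same index pattern recurs.
    """
    key = tuple(indices)
    sched = _schedule_cache.get(key)
    if sched is None:
        sched = {}
        for pos, g in enumerate(indices):
            # Remember both the destination slot and the global index.
            sched.setdefault(owner_of(g), []).append((pos, g))
        _schedule_cache[key] = sched
    return sched

def execute_gather(sched, fetch):
    """Executor: perform the communication the schedule describes.

    `fetch(proc, globals)` stands in for the actual message exchange with
    processor `proc`; here it is just a callback.
    """
    result = {}
    for proc, pairs in sched.items():
        globs = [g for _, g in pairs]
        vals = fetch(proc, globs)
        for (pos, _), v in zip(pairs, vals):
            result[pos] = v
    return [result[i] for i in range(len(result))]
```

Under a block distribution (say `owner_of = lambda g: g // 4`), a second loop iteration with the same indirection array hits the cache and skips the inspector entirely; the paper's interprocedural analysis decides where such schedule calls and communications can safely be hoisted across subroutine boundaries.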

This work was supported by NSF under grant No. ASC 9213821 and by ONR under contract No. N00014-93-1-0158. The authors assume all responsibility for the contents of the paper.




Editor information

Keshav Pingali, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua


Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Agrawal, G., Saltz, J. (1995). Interprocedural communication optimizations for distributed memory compilation. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025885


  • DOI: https://doi.org/10.1007/BFb0025885

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58868-9

  • Online ISBN: 978-3-540-49134-7

