Summary
For distributed-memory multicomputers, the quality of the data partitioning for a given application is crucial to obtaining high performance. This task has traditionally been the user’s responsibility, but in recent years much effort has been directed to automating the selection of data partitioning schemes. Several researchers have proposed systems that are able to produce data distributions that remain in effect for the entire execution of an application. For complex programs, however, such static data distributions may be insufficient to obtain acceptable performance. The selection of distributions that dynamically change over the course of a program’s execution adds another dimension to the data partitioning problem. In this chapter we present an approach for selecting dynamic data distributions as well as a technique for analyzing the resulting data redistribution in order to generate efficient code.
This research, performed at the University of Illinois, was supported in part by the National Aeronautics and Space Administration under Contract NASA NAG1-613, in part by an Office of Naval Research Graduate Fellowship, and in part by the Advanced Research Projects Agency under contract DAA-H04-94-G-0273 administered by the Army Research Office. We are also grateful to the National Center for Supercomputing Applications and the San Diego Supercomputer Center for providing access to their machines.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Palermo, D.J., Hodges, E.W., Banerjee, P. (2001). Compiler Optimization of Dynamic Data Distributions for Distributed-Memory Multicomputers. In: Pande, S., Agrawal, D.P. (eds) Compiler Optimizations for Scalable Parallel Systems. Lecture Notes in Computer Science, vol 1808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45403-9_13
DOI: https://doi.org/10.1007/3-540-45403-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41945-7
Online ISBN: 978-3-540-45403-8
eBook Packages: Springer Book Archive