Abstract
In this paper we present a unified approach for compiling programs for Distributed-Memory Multiprocessors (DMM). Parallelizing sequential programs for DMM is much more difficult than for shared-memory systems because each Virtual Processor (VP) has its own private local memory. The approach presented distributes computations among the VPs of the system and maps data onto their private memories. It tries to extract maximum parallelism from DO loops while minimizing interprocessor communication.
The method presented, named Graph Traverse Scheduling (GTS), is considered in this paper for single-nested loops that include one or several recurrences. In the parallel code generated, dependences included in a Hamiltonian recurrence that involves all the statements of the loop are enforced by the sequential execution of the computation assigned to each VP. Other dependences, not included in the Hamiltonian recurrence and involving data mapped onto different VPs, need explicit communication and synchronization.
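As a rough illustration of the idea that a recurrence can be enforced by sequential execution within each VP, the following sketch simulates a single-statement loop `a[i] = a[i-d] + b[i]` under a plain cyclic mapping of iterations to `d` VPs. This is not the GTS algorithm itself: the loop, the cyclic mapping, and the names `owner`, `run_sequential`, `run_partitioned` and `needs_communication` are all assumptions made up for this example.

```python
# Hypothetical sketch: cyclic mapping of a single-nested loop with one
# recurrence of distance d. Not the paper's GTS algorithm.

def owner(i, num_vps):
    """VP that owns iteration i (and element a[i]) under a cyclic mapping."""
    return i % num_vps

def run_sequential(b, d):
    """Reference loop: a[i] = a[i-d] + b[i] for i = d .. n-1."""
    a = list(b)
    for i in range(d, len(a)):
        a[i] = a[i - d] + b[i]
    return a

def run_partitioned(b, d):
    """Same loop, split into d independent chains, one per VP.

    Each VP executes its own iterations in increasing order, so the
    distance-d dependence is satisfied without any communication.
    The outer loop stands in for VPs that could run concurrently.
    """
    a = list(b)
    for vp in range(d):
        for i in range(vp + d, len(a), d):  # iterations with i % d == vp
            assert owner(i, d) == owner(i - d, d)  # source is local
            a[i] = a[i - d] + b[i]
    return a

def needs_communication(distance, num_vps):
    """A dependence of this distance crosses VPs unless num_vps divides it."""
    return distance % num_vps != 0
```

Under this assumed mapping, a second dependence of distance 1 with three VPs would connect data on different VPs (`needs_communication(1, 3)` is true), which corresponds to the case the abstract says requires explicit communication and synchronization.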
This work has been supported by the Ministry of Education of Spain (CICYT) in program TIC 299/89 and 392/89.
Copyright information
© 1991 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Torres, J., Ayguadé, E., Labarta, J., Llaberia, J.M., Valero, M. (1991). On automatic loop data-mapping for distributed-memory multiprocessors. In: Bode, A. (eds) Distributed Memory Computing. EDMCC 1991. Lecture Notes in Computer Science, vol 487. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032934
DOI: https://doi.org/10.1007/BFb0032934
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-53951-3
Online ISBN: 978-3-540-46478-5
eBook Packages: Springer Book Archive