Abstract
This paper presents a detailed case study of programming in a parallel programming system that targets complete and controlled parallelization of array-oriented computations. The purpose is to demonstrate how coherent integration of control and data parallelism enables both effective realization of an application's potential parallelism and matching of the degree of parallelism in a program to the resources of the execution environment. (“Our ability to reason is constrained by the language in which we reason.”) The programming system is based on an integrated graphical, declarative representation of control parallelism and data-partitioning parallelism. The example computation is even-odd reduction of block tridiagonal matrices, which proceeds in three phases, each with a different parallel structure. We derive, implement and measure the execution of a dynamic parallel computation structure that employs different levels of control and data parallelism in each phase to give load-balanced execution across a range of processor counts. The program formulated in the integrated representation revealed parallelism not visible in the original algorithm and sustains a constant level of actual parallelism throughout the computation, whereas the original algorithm had unbalanced levels of parallelism in different phases. The resulting program shows near-linear speed-up across all phases of the computation for processor counts ranging from 2 to 32.
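The phase structure the abstract refers to can be illustrated, for the scalar (non-block) tridiagonal case, by a sequential Python sketch of even-odd (cyclic) reduction; the block version replaces the scalar divisions by small matrix inversions. The function name, the 1-based zero-padded array layout, and the test values are illustrative choices for this sketch, not taken from the paper.

```python
def cyclic_reduction(a, b, c, d):
    """Even-odd (cyclic) reduction for a tridiagonal system.

    Inputs are 1-based lists with zero padding at indices 0 and n+1;
    the system size n must be 2**k - 1. Row i of the system is
        a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    """
    a, b, c, d = list(a), list(b), list(c), list(d)  # do not mutate caller's data
    n = len(b) - 2
    x = [0.0] * (n + 2)
    # Forward phase: eliminate odd-indexed unknowns, halving the active
    # system at each level. The updates within one level touch disjoint
    # rows, so each level is a data-parallel step whose width shrinks.
    h = 1
    while 2 * h <= n:
        for i in range(2 * h, n + 1, 2 * h):
            al = -a[i] / b[i - h]
            ga = -c[i] / b[i + h]
            b[i] += al * c[i - h] + ga * a[i + h]
            d[i] += al * d[i - h] + ga * d[i + h]
            a[i] = al * a[i - h]
            c[i] = ga * c[i + h]
        h *= 2
    # Back-substitution phase: recover the eliminated unknowns level by
    # level; the available parallelism doubles at each step.
    while h >= 1:
        for i in range(h, n + 1, 2 * h):
            x[i] = (d[i] - a[i] * x[i - h] - c[i] * x[i + h]) / b[i]
        h //= 2
    return x[1:n + 1]
```

The shrinking width of the forward levels and the growing width of the back-substitution levels are exactly the unbalanced parallelism profile that the paper's integrated control/data-parallel formulation flattens out.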
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Banerjee, D., Browne, J.C. (1998). “Optimal” parallelism through integration of data and control parallelism: A case study in complete parallelization. In: Li, Z., Yew, PC., Chatterjee, S., Huang, CH., Sadayappan, P., Sehr, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1997. Lecture Notes in Computer Science, vol 1366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032702
Print ISBN: 978-3-540-64472-9
Online ISBN: 978-3-540-69788-6