Abstract
This paper presents a methodology for synthesizing parallel programs for block recursive algorithms such as fast Fourier transforms and Strassen's matrix multiplication algorithm. A block recursive algorithm is expressed as a tensor product formula which consists of matrix sums, matrix products, direct sums, tensor products, componentwise matrix operations, and stride permutations. These mathematical operations can be mapped to high-level programming language constructs such as iteration, sequential composition, parallel composition, vector operations, and index computation. Translation of a tensor product formula consisting of these primitives into a parallel program involves determination of the proper indexing schemes for the arrays. This paper gives an algorithm to determine the indexing scheme and the code required for the index computation. Various parallel programs can be synthesized by manipulating tensor product formulas to exploit different computational structures. In this paper, we discuss some issues involved in formula manipulation for a particular target machine, the Cray Y-MP.
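As an illustration of the mapping described above, the following is a minimal sketch and is not taken from the paper. It shows, in plain Python, how two tensor product forms and a stride permutation might correspond to loops and index computations. The function names (`identity_tensor`, `tensor_identity`, `stride_perm`), the convention that the stride permutation gathers elements at a given stride, and the restriction to a square matrix `a` are all assumptions made for this example.

```python
def matvec(a, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def eye(m):
    """m x m identity matrix as nested lists."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]


def kron(a, b):
    """Explicit tensor (Kronecker) product; used only as a reference check."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def identity_tensor(m, a, x):
    """y = (I_m tensor A) x: A applied to m contiguous blocks of x.

    The iterations are independent, so this loop could run in parallel.
    """
    n = len(a)
    y = []
    for i in range(m):
        y.extend(matvec(a, x[i * n:(i + 1) * n]))
    return y


def tensor_identity(a, m, x):
    """y = (A tensor I_m) x: A applied to m subvectors of x taken at stride m.

    Each iteration reads and writes a strided slice, which resembles a
    vector operation on a machine with strided memory access.
    """
    n = len(a)
    y = [0] * (n * m)
    for j in range(m):
        y[j::m] = matvec(a, x[j::m])
    return y


def stride_perm(size, stride, x):
    """Stride permutation (assumed convention): gather x at the given stride."""
    return [x[j] for i in range(stride) for j in range(i, size, stride)]
```

Under these assumptions, the strided form can also be obtained from the block form by permuting the input and output, which is one example of the kind of formula manipulation the abstract refers to:

```python
a = [[1, 2], [3, 4]]
m, n = 3, 2
x = list(range(m * n))
via_perm = stride_perm(m * n, n, identity_tensor(m, a, stride_perm(m * n, m, x)))
print(via_perm == tensor_identity(a, m, x))
```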
This work was supported in part by DARPA, order number 7898, monitored by NIST under grant number 60NANB1D1151, DARPA, order number 7899, monitored by NIST under grant number 60NANB1D1150, and Ohio State University Seed Grant, No. 221337.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gupta, S., Huang, C.H., Sadayappan, P., Johnson, R. (1993). On the synthesis of parallel programs from tensor product formulas for block recursive algorithms. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1992. Lecture Notes in Computer Science, vol 757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57502-2_52
DOI: https://doi.org/10.1007/3-540-57502-2_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57502-3
Online ISBN: 978-3-540-48201-7
eBook Packages: Springer Book Archive