Abstract
We discuss a programming methodology based on the use of multilinear algebra to design and implement parallel algorithms for linear computations. In particular, we review techniques for implementing expressions involving the tensor product. We then show how the tensor product can be used to formulate Strassen's matrix multiplication algorithm. We report on our experience using this formulation and these techniques to implement a parallel version of Strassen's matrix multiplication algorithm on the Encore Multimax.
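To make the abstract's central idea concrete, here is a minimal sketch of one level of Strassen's algorithm written as a bilinear form: two linear maps applied to the flattened inputs, seven pointwise multiplications, then one linear map back. The names `PHI`, `PSI`, `GAMMA`, `matvec`, and `strassen_2x2` are illustrative assumptions, not notation taken from the article. The coefficients are the standard Strassen formulas, with entries ordered row-major as (x11, x12, x21, x22).

```python
# Sketch only: the standard Strassen formulas written as three coefficient
# matrices. All names here are hypothetical, chosen for illustration.

# Rows of PHI select the linear combinations of A's entries used in M1..M7.
PHI = [
    [1, 0, 0, 1],    # M1: a11 + a22
    [0, 0, 1, 1],    # M2: a21 + a22
    [1, 0, 0, 0],    # M3: a11
    [0, 0, 0, 1],    # M4: a22
    [1, 1, 0, 0],    # M5: a11 + a12
    [-1, 0, 1, 0],   # M6: a21 - a11
    [0, 1, 0, -1],   # M7: a12 - a22
]

# Rows of PSI select the linear combinations of B's entries used in M1..M7.
PSI = [
    [1, 0, 0, 1],    # M1: b11 + b22
    [1, 0, 0, 0],    # M2: b11
    [0, 1, 0, -1],   # M3: b12 - b22
    [-1, 0, 1, 0],   # M4: b21 - b11
    [0, 0, 0, 1],    # M5: b22
    [1, 1, 0, 0],    # M6: b11 + b12
    [0, 0, 1, 1],    # M7: b21 + b22
]

# Rows of GAMMA recombine the seven products into c11, c12, c21, c22.
GAMMA = [
    [1, 0, 0, 1, -1, 0, 1],   # c11 = M1 + M4 - M5 + M7
    [0, 0, 1, 0, 1, 0, 0],    # c12 = M3 + M5
    [0, 1, 0, 1, 0, 0, 0],    # c21 = M2 + M4
    [1, -1, 1, 0, 0, 1, 0],   # c22 = M1 - M2 + M3 + M6
]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a flat vector."""
    return [sum(c * x for c, x in zip(row, vector)) for row in matrix]


def strassen_2x2(a, b):
    """Multiply two 2x2 matrices given as flat row-major lists of 4 numbers."""
    left = matvec(PHI, a)                             # 7 combinations of A
    right = matvec(PSI, b)                            # 7 combinations of B
    products = [l * r for l, r in zip(left, right)]   # the 7 multiplications
    return matvec(GAMMA, products)                    # recombine into C


def naive_2x2(a, b):
    """Reference product using the usual 8 multiplications."""
    return [
        a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
        a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3],
    ]
```

For example, `strassen_2x2([1, 2, 3, 4], [5, 6, 7, 8])` agrees with `naive_2x2` on the same inputs. In a recursive version the four entries would be matrix blocks rather than scalars, and the seven products are independent of one another, which is what makes this form a natural candidate for parallel execution.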
Additional information
Supported in part by Defense Advanced Research Projects Agency DARPA Order No. 6674, monitored by AFOSR under contract No. F49620-89-C-0020.
About this article
Cite this article
Johnson, R.W., Huang, C.H. & Johnson, J.R. Multilinear algebra and parallel programming. J Supercomput 5, 189–217 (1991). https://doi.org/10.1007/BF00127843