On high-speed computing with a programmable linear array

The Journal of Supercomputing

Abstract

Many researchers have observed that systolic arrays are well suited to certain high-speed computations. Using a formal methodology, we present a design for a single, simple, programmable linear systolic array capable of solving a large number of problems drawn from a variety of applications. The methodology applies to problems solvable by sequential algorithms that can be specified as nested for-loops of arbitrary depth. The algorithms of this form that can be computed on the array presented in this paper include 25 algorithms dealing with signal and image processing, algebraic computations, matrix arithmetic, pattern matching, database operations, sorting, and transitive closure. Assuming bounded I/O, the time and storage complexities are optimal for 18 of those algorithms, so no improvement can be expected from dedicated special-purpose linear systolic arrays designed for individual algorithms. We also describe another design which, given a sufficiently large local memory and allowing data to be preloaded and unloaded, has an optimal processor/time product.
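
As a rough illustration of the kind of computation the abstract refers to, the sketch below takes a depth-2 nested for-loop (a matrix-vector product) and simulates it cycle by cycle on a linear array of processing elements. It is not the array design or the mapping methodology from the paper: the function names, the choice to keep row i of the matrix in the local memory of element i, and the left-to-right flow of the vector are all assumptions made for this example.

```python
def matvec_loops(A, x):
    """Sequential reference: a nested for-loop of depth 2."""
    n, m = len(A), len(x)
    y = [0] * n
    for i in range(n):
        for j in range(m):
            y[i] += A[i][j] * x[j]
    return y


def matvec_linear_array(A, x):
    """Hypothetical linear-array simulation of the same loop nest.

    Assumptions (illustrative only): PE i holds row A[i] and an
    accumulator y[i]; the entries of x enter at the left end and move
    one PE to the right per cycle. Requires at least one row.
    """
    n, m = len(A), len(x)
    y = [0] * n            # one accumulator per processing element
    pipe = [None] * n      # (j, x[j]) currently held by each PE, or None
    cycles = 0
    for t in range(n + m - 1):
        # Shift every x value one PE to the right and feed the next one in.
        incoming = (t, x[t]) if t < m else None
        pipe = [incoming] + pipe[:-1]
        # All PEs work in the same cycle on the value they currently hold.
        for i, slot in enumerate(pipe):
            if slot is not None:
                j, xj = slot
                y[i] += A[i][j] * xj
        cycles += 1
    return y, cycles
```

Under these assumptions the simulation uses n processing elements and n + m - 1 cycles for an n-by-m matrix, and it returns the same result as the sequential loop nest.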

Additional information

An earlier version of this paper was presented at Supercomputing '88.

This work was partially supported by ONR under the contract N00014-85-K-0046 and by NSF under Grant Number CCR-8906949.

About this article

Cite this article

Lee, P., Kedem, Z.M. On high-speed computing with a programmable linear array. J Supercomput 4, 223–249 (1990). https://doi.org/10.1007/BF00127833

  • DOI: https://doi.org/10.1007/BF00127833
