Limited width parallel prefix circuits

The Journal of Supercomputing


Abstract

In this paper, we present lower and upper bounds on the size of limited width, bounded and unbounded fan-out parallel prefix circuits. The lower bounds on the sizes of such circuits are a function of the depth, width, and number of inputs. The size requirement of an $N$-input bounded fan-out parallel prefix circuit having limited width $W$ and extra depth $k$ (the difference between allowed and minimum possible depth) is shown to be $\Omega(N \log_2 W / 2^k + N)$ for $k \le \log_2 W$. This implies that insisting on minimum depth causes the circuit size to be nonlinear, while as little as $\log_2 \log_2 W$ of extra depth can possibly reduce the size to linear. Also, we show that there is a clear difference between the two cases of bounded and unbounded fan-out by proving that the size of a limited width, unbounded fan-out parallel prefix circuit lies between a lower bound of $\Omega((2 + 2^{1-k}/3)N)$ and an upper bound of $O((2 + 2^{1-k})N)$.
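
As a brief worked check of the bounded fan-out claim, the two extreme cases can be substituted into the stated lower bound. This is only an illustration of the arithmetic in the abstract, not a derivation from the paper. It treats $\log_2 \log_2 W$ as an integer value of $k$ for simplicity.

$$
k = 0:\quad \Omega\!\left(\frac{N \log_2 W}{2^{0}} + N\right) = \Omega(N \log_2 W + N)
$$

$$
k = \log_2 \log_2 W:\quad 2^{k} = \log_2 W \;\Rightarrow\; \Omega\!\left(\frac{N \log_2 W}{\log_2 W} + N\right) = \Omega(2N) = \Omega(N)
$$

At minimum depth the bound grows with $\log_2 W$, whereas at $k = \log_2 \log_2 W$ the lower bound no longer rules out linear size. Because this is a lower bound, it shows only that linear size is not excluded at that depth.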

Uniform, systolic constructions of limited width parallel prefix circuits are provided here and shown to be asymptotically optimal. By associating the width of the circuit with the number of processors and the fan-out capabilities of the circuit with the interconnection structure of a multiprocessor, time- and processor-efficient algorithms may be developed.
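
The link between circuit width and processor count can be illustrated with a generic blocked prefix computation. The sketch below is a standard three-phase scheme written sequentially in Python. It is not the construction from the paper, and the names `blocked_prefix`, `width`, and `op` are hypothetical. It assumes `op` is associative and treats `width` as the number of blocks that could be processed independently.

```python
from itertools import accumulate
from operator import add


def blocked_prefix(xs, width, op=add):
    """Prefix-combine xs using at most `width` independent blocks.

    Assumption: `op` is associative (it need not be commutative).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    n = len(xs)
    if n == 0:
        return []
    block = -(-n // width)  # ceiling division: elements per block
    chunks = [xs[i:i + block] for i in range(0, n, block)]

    # Phase 1: scan each block on its own (one block per processor).
    local = [list(accumulate(c, op)) for c in chunks]

    # Phase 2: scan the block totals to get the carry into each block.
    totals = list(accumulate((part[-1] for part in local), op))

    # Phase 3: combine each later block with the total of all earlier blocks,
    # keeping the carry on the left so non-commutative operators still work.
    out = list(local[0])
    for i in range(1, len(local)):
        carry = totals[i - 1]
        out.extend(op(carry, v) for v in local[i])
    return out
```

For example, `blocked_prefix([1, 2, 3, 4, 5, 6, 7, 8], width=3)` splits the input into blocks of three elements and returns the same running sums as a plain sequential scan. Phases 1 and 3 are the parts that a machine with `width` processors could run in parallel.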

Cite this article

Carlson, D.A., Sugla, B. Limited width parallel prefix circuits. J Supercomput 4, 107–129 (1990). https://doi.org/10.1007/BF00127876

  • DOI: https://doi.org/10.1007/BF00127876