ABSTRACT
This paper discusses the development of a high-speed pipelined arithmetic system suitable for recursive numeric computations. The core of the arithmetic system is an online pipeline network. The architectural design of the system is presented first. The paper then describes how such a system can be organized to support a broad range of recursive computations that have not been amenable to pipelining by other techniques. The LU factorization of a tridiagonal matrix serves as an example, providing timing comparisons among the online pipeline network, the CRAY-1, and the systolic array presented by Kung and Leiserson (1978).
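To illustrate why the tridiagonal LU factorization counts as a recursive computation, the following is a minimal sketch of the standard textbook recurrence, not the paper's online algorithm or its pipeline organization. The function name `tridiagonal_lu` and the argument names `sub`, `diag`, and `sup` are assumptions chosen for this illustration, and no pivoting is performed.

```python
def tridiagonal_lu(sub, diag, sup):
    """Factor a tridiagonal matrix A = L U without pivoting (illustrative sketch).

    sub  : sub-diagonal entries   (length n - 1)
    diag : main-diagonal entries  (length n)
    sup  : super-diagonal entries (length n - 1)

    Returns (l, u): l holds the sub-diagonal of the unit lower-bidiagonal L,
    u holds the main diagonal of the upper-bidiagonal U. The super-diagonal
    of U equals sup and is therefore not returned.
    """
    n = len(diag)
    if n == 0 or len(sub) != n - 1 or len(sup) != n - 1:
        raise ValueError("expected diag of length n >= 1 and sub, sup of length n - 1")

    l = [0.0] * (n - 1)
    u = [0.0] * n
    u[0] = diag[0]
    for i in range(1, n):
        if u[i - 1] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # Each step needs u[i - 1] from the previous step: a first-order
        # recurrence built from one division, one multiplication, and one
        # subtraction, so successive steps cannot simply be overlapped.
        l[i - 1] = sub[i - 1] / u[i - 1]
        u[i] = diag[i] - l[i - 1] * sup[i - 1]
    return l, u


if __name__ == "__main__":
    l, u = tridiagonal_lu([2.0, 2.0], [4.0, 4.0, 4.0], [1.0, 1.0])
    print(l, u)
```

The dependence of each `u[i]` on `u[i - 1]` through a division is the kind of data dependence that makes such recurrences hard to pipeline with conventional arithmetic units, which is the class of computation the abstract refers to.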
- 1. Atkins, D. E., "Introduction to the Role of Redundancy in Computer Arithmetic," Computer, Vol. 8, No. 6, pp. 84-76, June 1975.
- 2. Avizienis, A., "Signed-digit Number Representations for Fast Parallel Arithmetic," IRE Trans. on Electronic Computers, p. 389, 1961.
- 3. Calahan, D. A., "A Block-Oriented Sparse Equation Solver for the CRAY-1," Proc. 1979 Inter. Conf. on Parallel Processing, pp. 116-123, August 1979.
- 4. Chen, S. C., D. J. Kuck, and A. H. Sameh, "Practical Parallel Band Triangular System Solvers," ACM Transactions on Mathematical Software (Sept. 1978), pp. 270-277.
- 5. Chow, C. Y. and J. E. Robertson, "Logical Design of a Redundant Binary Adder," Proc. of Fourth Symp. on Computer Arithmetic, Santa Monica, CA, Oct. 1978.
- 6. DeLugish, B. G., "A Class of Algorithms for Automatic Evaluation of Certain Elementary Functions in a Binary Computer," Ph.D. Thesis, Report 399, Department of Computer Science, University of Illinois, Urbana, June 1970.
- 7. Ercegovac, M. D., "A General Method for Evaluation of Functions in a Digital Computer," Ph.D. Thesis, Report No. 750, Department of Computer Science, University of Illinois, Urbana, Aug. 1975.
- 8. Ercegovac, M. D., "An On-Line Square Rooting Algorithm," Proceedings of Fourth Symposium on Computer Arithmetic, Santa Monica, CA, Oct. 1978.
- 9. Fong, R. and T. L. Jordan, Some Linear Algebraic Algorithms and Their Performance on the CRAY-1, Report LA-6774, Los Alamos Scientific Laboratory, June 1977.
- 10. Gajski, D. D., "Solving Banded Triangular Systems on Pipelined Machines," Proc. 1979 Inter. Conf. on Parallel Processing, pp. 308-319, August 1979.
- 11. Heller, D., "On the Efficient Computation of Recurrence Relations," NASA Langley Research Center, Institute for Computer Applications in Science and Engineering (ICASE), Hampton, VA (June 1974).
- 12. Heller, D., "A Survey of Parallel Algorithms in Numerical Linear Algebra," SIAM Review, Vol. 20, No. 4, pp. 740-777, Oct. 1978.
- 13. Horowitz, E., "VLSI Architectures for Matrix Computations," Proc. 1979 Inter. Conf. on Parallel Processing, pp. 124-127, August 1979.
- 14. Hwang, K., Computer Arithmetic: Principles, Architecture and Design, John Wiley, 1979.
- 15. Hyafil, L., and H. T. Kung, "The Complexity of Parallel Evaluation of Linear Recurrences," Journal of the ACM (July 1977), pp. 513-521.
- 16. Irwin, M. J., "An Arithmetic Unit for On-Line Computation," Ph.D. Thesis, Report No. UIUCDCS-R-77-873, Department of Computer Science, University of Illinois, Urbana, May 1977.
- 17. Irwin, M. J., "A Pipelined Processing Unit for On-Line Division," Proc. of the Fifth Annual Symp. on Computer Architecture, Palo Alto, CA, April 1978.
- 18. Irwin, M. J., "Reconfigurable Pipeline Systems," Proceedings of 1978 Annual Conference of the ACM, pp. 86-92, Washington, D. C., Dec. 1978.
- 19. Kogge, P. M., "Maximal Rate Pipelined Solutions to Recurrence Problems," Proc. First Ann. Symp. on Comp. Architecture, pp. 71-76, Gainesville, FL, 1973.
- 20. Kuck, D. J., The Structure of Computers and Computations, Vol. I, John Wiley & Sons, Inc., (1978).
- 21. Kung, H. T. and C. E. Leiserson, "Systolic Arrays for VLSI," Computer Science Research Reviews, Carnegie-Mellon Univ., 1977-78.
- 22. Owens, R. M. and M. J. Irwin, "On-Line Algorithms for the Design of Pipeline Architectures," Proc. Sixth Annual Symp. of Computer Architecture, Philadelphia, PA, pp. 12-19, April 1979.
- 23. Ramamoorthy, C. V. and H. F. Li, "Pipeline Architecture," Computing Surveys, Vol. 9, No. 1, pp. 61-102, March 1977.
- 24. Russell, R. M., "The CRAY-1 Computer System," Communications of the ACM, Vol. 21, No. 1, pp. 63-72, January 1978.
- 25. Sameh, A. H., and R. P. Brent, "Solving Triangular Systems on a Parallel Computer," SIAM Journal of Numerical Analysis (1977), pp. 1101-1113.
- 26. Specker, W. H., "A Class of Algorithms for Ln X, Exp X, Sin X, Cos X, Tan⁻¹ X, Cot⁻¹ X," IEEE Transactions on Electronic Computers, Vol. EC-14, No. 1, pp. 85-86, Feb. 1965.
- 27. Trivedi, K. S. and M. D. Ercegovac, "On-line Algorithms for Division and Multiplication," Proceedings of the Third IEEE Symposium on Computer Arithmetic, Dallas, Texas, Nov. 1975.
- 28. Trivedi, K. S. and M. D. Ercegovac, "On-line Algorithms for Division and Multiplication," IEEE Transactions on Computers, Vol. C-26, No. 7, pp. 681-687, July 1977.
- 29. Trivedi, K. S. and J. G. Rusnak, "Higher Radix On-Line Division," Proc. of Fourth Symp. on Computer Arithmetic, Santa Monica, CA, Oct. 1978.
Index Terms
- Online pipeline systems for recursive numeric computations