Abstract
This paper discusses strategies for implementing DSP systems using residue replication. The theory, recently introduced by two of the authors, uses formal polynomial ring mappings to take binary representations to a direct product ring implementation of integer processing arrays. The mapping produces completely independent computational arrays, each computing over the same ring. We describe an architecture and processing array that implement, and take advantage of, the special computational ring structures that result from the mapping. A brief review of the theory and mapping techniques is followed by a discussion of the architecture and VLSI design of an efficient inner product processing array using Fermat primes.
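The idea of independent arrays that all compute over the same ring can be illustrated with a small software sketch. It is not the architecture described in the paper; it is a minimal illustration under stated assumptions. Each input is split into base-4 digits, which are treated as the coefficients of a polynomial. The polynomial is evaluated at a few distinct points modulo the Fermat prime 257. Each point then acts as one independent channel, and the result is recovered by interpolation. The radix, the evaluation points, and all function names are hypothetical choices made for this example.

```python
# Illustrative sketch only: parameters and names are assumptions, not the paper's design.
P = 257             # Fermat prime 2**8 + 1; every channel computes modulo this same ring
B = 4               # digit radix used to split each input (assumed)
POINTS = [0, 1, 2]  # distinct evaluation points; a degree-2 product needs three


def digits(x, n=2):
    """Split a non-negative integer into n base-B digits, least significant first."""
    out = []
    for _ in range(n):
        out.append(x % B)
        x //= B
    return out


def evaluate(coeffs, r):
    """Evaluate the digit polynomial at point r modulo P (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * r + c) % P
    return acc


def channel_inner_product(xs, ys, r):
    """One independent channel: multiply-accumulate over Z_P at evaluation point r."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = (acc + evaluate(digits(x), r) * evaluate(digits(y), r)) % P
    return acc


def interpolate(values):
    """Recover polynomial coefficients from channel outputs (Lagrange, modulo P)."""
    n = len(POINTS)
    coeffs = [0] * n
    for i, (ri, vi) in enumerate(zip(POINTS, values)):
        basis = [1]  # Lagrange basis numerator, low-order coefficient first
        denom = 1
        for j, rj in enumerate(POINTS):
            if j == i:
                continue
            # Multiply the basis polynomial by (X - rj).
            basis = [(a - rj * b) % P for a, b in zip([0] + basis, basis + [0])]
            denom = (denom * (ri - rj)) % P
        scale = vi * pow(denom, -1, P) % P  # modular inverse, Python 3.8+
        for k in range(n):
            coeffs[k] = (coeffs[k] + scale * basis[k]) % P
    return coeffs


def inner_product(xs, ys):
    """Run every channel independently, then map back to an ordinary integer."""
    values = [channel_inner_product(xs, ys, r) for r in POINTS]
    coeffs = interpolate(values)
    # Valid only while each true coefficient stays below P.
    return sum(c * B**k for k, c in enumerate(coeffs))


print(inner_product([3, 7, 12, 15], [5, 9, 2, 14]))  # 312
```

The result, 312, is larger than the modulus 257, even though every channel works only modulo 257. The sketch is exact only while each coefficient of the summed product polynomial stays below 257, so the vector length and digit size must be bounded accordingly.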
Cite this article
Jullien, G.A., Luo, W. & Wigley, N.M. High throughput VLSI DSP using replicated finite rings. J VLSI Sign Process Syst Sign Image Video Technol 14, 207–220 (1996). https://doi.org/10.1007/BF00925500
DOI: https://doi.org/10.1007/BF00925500