Abstract
The two-dimensional discrete cosine transform (2D-DCT) is at the core of image encoding and compression applications. We present a new architecture for the 2D-DCT based on row-column decomposition. We also discuss an efficient architecture for computing the one-dimensional fast direct (1D-DCT) and inverse (1D-IDCT) cosine transforms, which is based on reordering the butterflies after their computation. The architectures exploit locality, which allows pipelining between stages and in-place computation that saves memory. The result is an efficient architecture for high-speed computation of the 1D- and 2D-DCT that significantly reduces the area required for VLSI implementation.
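The row-column decomposition mentioned above can be illustrated in software. The following is a minimal sketch, not the hardware architecture of the paper: it assumes the orthonormal DCT-II definition and uses a naive O(N^2) 1D transform instead of a fast butterfly-based one. The function names (`dct_1d`, `dct_2d_row_column`, `dct_2d_direct`) are hypothetical and chosen only for this example.

```python
import math


def dct_1d(x):
    """Naive orthonormal DCT-II of a sequence (O(N^2), for illustration only)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def transpose(m):
    """Transpose a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dct_2d_row_column(block):
    """2D-DCT as 1D-DCTs over the rows, then 1D-DCTs over the columns."""
    rows_done = [dct_1d(row) for row in block]
    cols_done = [dct_1d(col) for col in transpose(rows_done)]
    return transpose(cols_done)


def dct_2d_direct(block):
    """2D-DCT evaluated straight from the double-sum definition (reference)."""
    n, m = len(block), len(block[0])

    def scale(k, size):
        return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)

    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            acc = 0.0
            for i in range(n):
                for j in range(m):
                    acc += (
                        block[i][j]
                        * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                        * math.cos(math.pi * (2 * j + 1) * v / (2 * m))
                    )
            out[u][v] = scale(u, n) * scale(v, m) * acc
    return out
```

Because the 2D kernel is separable, both functions should agree up to rounding error. The row-column form needs only a 1D transform unit plus a transposition step, which is the property that row-column hardware designs typically exploit.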
Cite this article
Sánchez, M., López, J., Plata, O. et al. An Efficient Architecture for the In-Place Fast Cosine Transform. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 21, 91–102 (1999). https://doi.org/10.1023/A:1008044104579
DOI: https://doi.org/10.1023/A:1008044104579