Abstract
The Discrete Cosine Transform (DCT) is one of the most widely used techniques for image compression, and several algorithms have been proposed to implement the 2D-DCT. The scaled DCT (SDCT) algorithm is an optimization of the 1D-DCT that gathers all the multiplications at the end of the computation. In this paper, in addition to a hardware implementation on an FPGA, we extend this optimization by merging those multiplications into the quantization block without affecting image quality. The quantization is also simplified to keep the performance of the whole chain high. Tests in the MATLAB environment show that the proposed approach produces images of nearly the same quality as those obtained with the JPEG standard. FPGA-based implementations of the proposed approach are presented and compared with other state-of-the-art techniques. The target device is an Altera Cyclone II FPGA, synthesized with the Quartus tool. Results show that our approach outperforms the others in terms of processing speed, resource usage, and power consumption. The architecture is also compared with a distributed-arithmetic-based architecture.
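The idea of folding the final scaling multiplications into the quantizer can be illustrated with a small floating-point sketch. This is not the authors' FPGA architecture: the function names are hypothetical, the scale factors are assumed to be those commonly associated with Arai-style scaled DCTs, the quantization table is assumed to be the standard JPEG luminance table, and the unscaled transform is obtained here by dividing the scale factors out of the full DCT matrix rather than by a fast flow graph.

```python
import math

N = 8

# Assumption: standard JPEG luminance quantization table (ITU-T T.81, Annex K).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct_matrix():
    """Orthonormal 8-point DCT-II matrix C."""
    rows = []
    for k in range(N):
        c = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        rows.append([c * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                     for n in range(N)])
    return rows


def scale_factors():
    """Assumed per-frequency output scale factors d[k] of a scaled 1D-DCT."""
    return [1 / (2 * math.sqrt(2))] + [
        1 / (4 * math.cos(k * math.pi / 16)) for k in range(1, N)
    ]


def dct2d_orthonormal(block):
    """Reference 2D-DCT: Y = C X C^T."""
    c = dct_matrix()
    return matmul(matmul(c, block), transpose(c))


def dct2d_unscaled(block):
    """2D transform with the final scaling left out: Z[u][v] = Y[u][v] / (d[u] d[v])."""
    c, d = dct_matrix(), scale_factors()
    a = [[c[k][n] / d[k] for n in range(N)] for k in range(N)]
    return matmul(matmul(a, block), transpose(a))


def merged_table():
    """Fold the scale factors into the quantizer: M[u][v] = d[u] d[v] / Q[u][v]."""
    d = scale_factors()
    return [[d[u] * d[v] / Q_LUMA[u][v] for v in range(N)] for u in range(N)]


def quantize_separate(block):
    """Scale inside the DCT, then divide by Q."""
    y = dct2d_orthonormal(block)
    return [[round(y[u][v] / Q_LUMA[u][v]) for v in range(N)] for u in range(N)]


def quantize_merged(block):
    """One multiplication per coefficient covers both scaling and quantization."""
    z, m = dct2d_unscaled(block), merged_table()
    return [[round(z[u][v] * m[u][v]) for v in range(N)] for u in range(N)]
```

Because the merged table can be computed once offline, the run-time cost per coefficient in this sketch is a single multiplication, which is the kind of saving the merged design aims for. A fixed-point hardware version would also have to choose word lengths for the table entries, which this floating-point sketch does not address.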
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig6_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-012-1043-y/MediaObjects/11042_2012_1043_Fig7_HTML.gif)
Cite this article
Hatim, A., Belkouch, S., El Aakif, M. et al. Design optimization of the quantization and a pipelined 2D-DCT for real-time applications. Multimed Tools Appl 67, 667–685 (2013). https://doi.org/10.1007/s11042-012-1043-y
DOI: https://doi.org/10.1007/s11042-012-1043-y