Abstract
The Discrete Cosine Transform (DCT) is one of the most widely used transforms in digital signal processing and the core algorithm of image and video coding systems. In this paper, we propose an algorithm that generates enhanced Cordic-based Loeffler DCT architectures for angle precisions ranging from 10⁻¹ to 10⁻⁷. High-level PSNR, area and power estimators are proposed to trade off power consumption against image quality. An optimal architecture is retained for its low complexity, low power and high PSNR. Its complexity is the lowest among conventional DCT architectures, including the BinDCT, which is a reference for reduced complexity. The selected architecture also has the PSNR closest to that of the reference Loeffler DCT architecture, without a substantial power penalty.

References
Loeffler, C., Ligtenberg, A., & Moschytz, G. S. (1989). Practical fast 1-D DCT algorithms with 11 multiplications. In Proceedings of ICASSP (Vol. 2, pp. 988–991). Glasgow, UK.
Jeong, H., Kim, J., & Cho, W. K. (2004). Low-power multiplierless DCT architecture using image correlation. IEEE Transactions on Consumer Electronics, 50(1), 262–267.
Arathanasis, H. C. (1993). On computing the 2-D Discrete Cosine Transform Using Rotations, Microprocessors and Microprogramming 38.
Volder, J. E. (1959). The CORDIC trigonometric computing technique. IRE Transactions on Electronic Computers, EC-8, 330–334.
Walther, J. (1971). A unified algorithm for elementary functions. Proceedings Spring Joint Computer Conference, 38, 379–385.
Mariatos, E. P., Metafas, D. E., Hallas, J. A., & Goutis, C. E. (1994). A fast DCT processor, based on special purpose CORDIC rotators. Proceedings of IEEE International Symposium on Circuits and Systems, 4, 271–274.
Sun, C.-C., Ruan, S.-J., Heyne, B., & Goetze, J. (2007). Low-power and high quality Cordic-based Loeffler DCT for signal processing. IET Circuits, Devices & Systems, 1(6), 453–461.
Sun, C. -C., Donner, P., & Götze, J. (2012). VLSI implementation of a configurable IP Core for quantized discrete cosine and integer transforms. International Journal of Circuit Theory and Applications, 40(11), 1107–1126.
Dang, P. P., Chau, P. M., Nguyen, T. Q., & Tran, T. D. (2005). BinDCT and its efficient VLSI architectures for real-time embedded applications. Journal of Imaging Science and Technology, 49(2), 124–137.
Fritts, J. E., Steiling, F. W., Tucek, J. A., & Wolf, W. (2009). MediaBench II video: Expediting the next generation of video systems research. Microprocessors and Microsystems, 33(4), 301–318.
Deng, L., Sobti, K., Chakrabarti, C., & Zhang, Y. (2011). Accurate area, time and power models for FPGA-based implementations. Journal of Signal Processing Systems.
Lee, M. -W., Yoon, J. -H., & Park, J. (2014). Reconfigurable CORDIC-Based Low-Power DCT Architecture Based on Data Priority. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 22(5), 1060–1068.
Shafique, M., Bauer, L., & Henkel, J. (2008). Optimizing the H.264/AVC Video Encoder Application Structure for Reconfigurable and Application-Specific Platforms. Journal of Signal Processing Systems (JSPS).
Shafique, M., Bauer, L., & Henkel, J. (2010). Optimizing the H.264/AVC Video Encoder Application Structure for Reconfigurable and Application-Specific Platforms. Journal of Signal Processing Systems.
Fang, J. T., Tsai, Y. C., Lee, J. X., & Yu, P. S. (2016). Computation Reduction in Transform Unit of High Efficiency Video Coding based on Zero Coefficients. International Symposium on Computer, Consumer and Control.
Liu, Y. Y., Chen, H. X., Zhao, Y., & Sun, H. Y. (2015). Discrete cosine transform optimization in image compression based on genetic algorithm, 8th International Congress on Image and Signal Processing (CISP).
Meeuws, R., Ostadzadeh, S. A., Galuzzi, C., Sima, V. M., Nane, R., & Bertels, K. (2013). Quipu: A statistical model for predicting hardware resources. ACM Transactions on Reconfigurable Technology and Systems, 6(1).
International Organization for Standardization. ITU-T Recommendation T.81. In ISO/IEC IS 10918-1, http://www.jpeg.org/jpeg/ JPEG homepage (2016).
Tao, Z., Liu, S., & He, J. (2007). A New Algorithm on Short Window MDCT for Dolby AC3, Proceedings of ISPACS.
Zhang, J., Chow, P., & Liu, H. (2015). FPGA Implementation of Low-Power and High-PSNR DCT/IDCT Architecture based on Adaptive Recoding CORDIC. International Conference on Field Programmable Technology (FPT).
Le Gall, D. (1991). MPEG: a video compression standard for multimedia applications. Communications of the ACM, Special Issue on Digital Multimedia Systems, 34(4), 46–58.
Xilinx Power Estimator User Guide, UG440 (v2014.1) Xilinx Homepage (2016).
Mehri, H., & Alizadeh, B. (2015). Analytical performance model for FPGA-based reconfigurable computing. Microprocessors and Microsystems, 39, 796–806.
Vivado Design Suite User Guide, Model-Based DSP Design Using System Generator, UG897 (v2015.3) 30, (2015).
Appendices
Annex 1
This annex gives the detailed PSNR values measured with a JPEG compression chain applied to the Lena, Baboon, Peppers and Goldhill test images.
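For readers who want to reproduce this kind of measurement, the following is a minimal sketch of the standard PSNR definition for 8-bit images. It is not taken from the paper: the function name `psnr`, the list-of-rows image representation and the default peak value of 255 are assumptions made for illustration.

```python
import math

def psnr(reference, test, peak=255.0):
    """Return the PSNR in dB between two equally sized images.

    Images are given as lists of rows of pixel values (hypothetical
    representation chosen for this sketch). `peak` is the maximum
    possible pixel value, 255 for 8-bit images.
    """
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same dimensions")

    pixel_count = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for ref_row, test_row in zip(reference, test)
        for a, b in zip(ref_row, test_row)
    )
    mse = squared_error / pixel_count

    # Identical images have zero error, so the PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

In a JPEG-style chain, `reference` would be the original image and `test` the image reconstructed after the forward DCT, quantization, dequantization and inverse DCT.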
Annex 2
This annex gives the detailed implementation of each Cordic block for each angle precision.
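To illustrate how an angle precision can determine the structure of a Cordic block, here is a minimal sketch of the textbook CORDIC iteration, stopped as soon as the residual angle falls below a chosen precision. This is not the paper's generation algorithm, which may select or skip micro-rotations differently; the names `cordic_schedule` and `cordic_rotate` are hypothetical, and floating-point arithmetic stands in for the shift-and-add hardware.

```python
import math

def cordic_schedule(angle, precision, max_iters=64):
    """Decompose `angle` (radians) into CORDIC micro-rotations.

    Returns (steps, residual), where steps is a list of (i, sigma)
    pairs meaning "rotate by sigma * atan(2**-i)". Iteration stops
    once |residual| <= precision. Textbook CORDIC, assumed here for
    illustration only.
    """
    steps = []
    residual = angle
    for i in range(max_iters):
        if abs(residual) <= precision:
            break
        sigma = 1 if residual >= 0 else -1
        residual -= sigma * math.atan(2.0 ** -i)
        steps.append((i, sigma))
    if abs(residual) > precision:
        raise ValueError("angle not reached within max_iters")
    return steps, residual

def cordic_rotate(x, y, steps):
    """Rotate (x, y) counter-clockwise using the given micro-rotations."""
    for i, sigma in steps:
        # In hardware, the multiplication by 2**-i is a right shift.
        x, y = x - sigma * y * 2.0 ** -i, y + sigma * x * 2.0 ** -i
    # Each micro-rotation scales the vector by sqrt(1 + 2**(-2i)).
    gain = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i, _ in steps)
    return x / gain, y / gain

if __name__ == "__main__":
    # Number of micro-rotations needed for pi/16 at several precisions.
    for precision in (1e-1, 1e-3, 1e-5, 1e-7):
        steps, _ = cordic_schedule(math.pi / 16, precision)
        print(precision, len(steps))
```

A tighter precision yields a longer list of micro-rotations, and therefore more shift-and-add stages, which is the kind of accuracy-versus-cost trade-off the estimators in this paper are meant to explore.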
Annex 3
This annex gives the detailed values of the power as a function of frequency.
Cite this article
Mami, S., Saad, I.B., Lahbib, Y. et al. Enhanced Configurable DCT Cordic Loeffler Architectures for Optimal Power-PSNR Trade-Off. J Sign Process Syst 90, 371–393 (2018). https://doi.org/10.1007/s11265-017-1245-7
DOI: https://doi.org/10.1007/s11265-017-1245-7