Performance Improvement of Vector-Radix Decimation-in-Frequency 3D-DCT/IDCT Using Variable Word Length

Arunachalam, V.; Joseph Raj, Alex Noel; Deepika, S.

doi:10.1007/s00034-020-01557-w

Published: 08 October 2020

Circuits, Systems, and Signal Processing, Volume 40, pages 1818–1831 (2021)

Abstract

High-speed video coding and compression based on three-dimensional discrete cosine transforms (3D-DCT) are used extensively in many IoT applications to obtain optimum data usage and resolution. We propose an efficient hardware implementation of a high-speed vector-radix decimation-in-frequency (VR-DIF) 3D-DCT with optimum area and power consumption. In previous implementations, the data path arithmetic units used a fixed word length (16, 18 or 21 bits), whereas the proposed architecture uses word lengths ranging from 11 bits (1-bit sign, 1-bit integer and 9-bit fraction) to 20 bits (1-bit sign, 10-bit integer and 9-bit fraction) to achieve lower silicon area and power consumption. The architecture is optimally pipelined to achieve a high processing speed (above 3 Giga samples/s). To test the proposed architecture, an \(8\times 8\times 8\) video cube with a pixel depth of 8 bits is considered. The arithmetic functional units required for implementing the \(8\times 8\times 8\) 3D-DCT/IDCT processor, such as the signed adder/subtractor and the cosine coefficient multipliers, are designed with the proposed variable word length. The core of the VR-DIF 3D-DCT/IDCT with variable word length is implemented using the TSMC 90 nm technology library. The proposed architecture consumes 26.5% less area and 23.2% less power than the existing fixed word length 3D-DCT-II implementation when tested at a maximum frequency of 653 MHz.
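The two word lengths quoted in the abstract can be made concrete with a small sketch. It assumes a two's-complement interpretation of the "sign, integer, fraction" split, with round-to-nearest and saturation on overflow; the abstract does not state the rounding or overflow behaviour, so those choices, like the names `FixedPointFormat`, `NARROW` and `WIDE`, are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedPointFormat:
    """Signed fixed-point format: 1 sign bit + int_bits + frac_bits.

    Assumption: two's-complement encoding, so the range is
    [-2**int_bits, 2**int_bits - 2**-frac_bits].
    """
    int_bits: int
    frac_bits: int

    @property
    def word_length(self) -> int:
        return 1 + self.int_bits + self.frac_bits

    @property
    def step(self) -> float:
        # Smallest representable increment.
        return 2.0 ** -self.frac_bits

    @property
    def min_value(self) -> float:
        return -(2.0 ** self.int_bits)

    @property
    def max_value(self) -> float:
        return 2.0 ** self.int_bits - self.step

    def quantize(self, x: float) -> float:
        """Round to the nearest step, then saturate (both are assumptions)."""
        q = round(x / self.step) * self.step
        return min(max(q, self.min_value), self.max_value)


# The narrowest and widest formats named in the abstract.
NARROW = FixedPointFormat(int_bits=1, frac_bits=9)   # 11-bit word
WIDE = FixedPointFormat(int_bits=10, frac_bits=9)    # 20-bit word

if __name__ == "__main__":
    for fmt in (NARROW, WIDE):
        print(fmt.word_length, fmt.min_value, fmt.max_value, fmt.step)
```

Under these assumptions both formats share the same resolution of 2^-9, and only the representable range differs, which is why widening a stage costs integer bits rather than fraction bits.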



Acknowledgements

This research was financially supported by The Research Start-Up Fund Subsidized Project of Shantou University, China, Grant No. NTF17016. The authors would like to thank CH Vijendra Kumar, M.Tech. VLSI design student, for assisting in ASIC synthesis trials, and Vellore Institute of Technology, Vellore, for providing laboratory facilities.

Author information

Authors and Affiliations

Department of Micro and Nano Electronics, School of Electronics Engineering, Vellore Institute of Technology, Vellore, India
V. Arunachalam & S. Deepika
Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Department of Electronic Engineering, College of Engineering, Shantou University, Shantou, China
Alex Noel Joseph Raj

Corresponding author

Correspondence to Alex Noel Joseph Raj.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The optimum word lengths for the various stages of the architecture are analysed using the MATLAB models of the VR-DIF 3D-DCT and VR-DIF 3D-IDCT and are listed in Tables 7 and 8.

Table 7 Stage-wise word length representation for VR 3D-DCT

Table 8 Stage-wise word length representation for VR 3D-IDCT
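The authors' MATLAB model and the table contents are not available in this preview. As a rough sketch of how stage-wise word lengths are commonly derived from a simulation, the following assumes two's-complement values, the fixed 9-bit fraction quoted in the abstract, and per-stage sample dumps from some reference model. The function names and the sample values in the usage example are hypothetical and are not taken from Tables 7 or 8.

```python
import math
from typing import Iterable, List

FRAC_BITS = 9  # fraction width quoted in the abstract


def integer_bits_needed(max_abs: float) -> int:
    """Integer bits m such that max_abs < 2**m (two's complement assumed)."""
    if max_abs <= 0:
        return 0
    return max(0, math.floor(math.log2(max_abs)) + 1)


def word_length(max_abs: float, frac_bits: int = FRAC_BITS) -> int:
    """Total width: 1 sign bit + integer bits + fraction bits."""
    return 1 + integer_bits_needed(max_abs) + frac_bits


def stage_word_lengths(stage_outputs: Iterable[Iterable[float]]) -> List[int]:
    """Pick a word length per stage from the peak magnitude seen there.

    stage_outputs holds one collection of simulated values per stage.
    """
    lengths = []
    for values in stage_outputs:
        peak = max(abs(v) for v in values)
        lengths.append(word_length(peak))
    return lengths


if __name__ == "__main__":
    # Hypothetical per-stage dumps, for illustration only.
    dumps = [[0.5, -1.5], [12.0, -40.25], [300.0, -1023.9]]
    print(stage_word_lengths(dumps))
```

A peak magnitude below 2 maps to the 11-bit format and a peak just under 1024 maps to the 20-bit format, which matches the two endpoints given in the abstract. Sizing from observed peaks only covers the test data used, so a practical flow would typically add margin or a worst-case bound.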

About this article

Cite this article

Arunachalam, V., Joseph Raj, A.N. & Deepika, S. Performance Improvement of Vector-Radix Decimation-in-Frequency 3D-DCT/IDCT Using Variable Word Length. Circuits Syst Signal Process 40, 1818–1831 (2021). https://doi.org/10.1007/s00034-020-01557-w

Received: 08 February 2020
Revised: 16 September 2020
Accepted: 21 September 2020
Published: 08 October 2020
Issue Date: April 2021
DOI: https://doi.org/10.1007/s00034-020-01557-w
