A hierarchical multiplier-free architecture for HEVC transform

Fan, Chunxiao; Li, Fu; Shi, Guangming; Niu, Yi; Qi, Fei; Xie, Xuemei; Jiao, Dandan

doi:10.1007/s11042-015-3114-3

Published: 26 November 2015

Volume 76, pages 997–1015, (2017)

Multimedia Tools and Applications

Abstract

Despite its high decorrelation performance, the large transform block size in High Efficiency Video Coding (HEVC) adds undesirable complexity to hardware design. The heaviest burden in implementing the HEVC transform is the large number of multiplications. In this paper, we propose a novel hierarchical multiplier-free architecture for the HEVC transform: a multiplier-free version of the partial butterfly combined with matrix multiplications (PBMM) architecture, based on vector decomposition (VD-PBMM). In the proposed architecture, the complicated matrix multiplication in PBMM is carried out in several simple stages, which simplifies its VLSI realization. Each stage involves only additions and multiplications by powers of two, which can be implemented with shifters and adders. In addition, the new architecture balances the distribution of adders to improve the system frequency. The proposed architecture has been evaluated with TSMC 0.13 µm CMOS technology. The resulting system runs at 400 MHz with 92 K logic gates, which is about half that of the PBMM method when the latency is 8. The proposed architecture computes the transform without any performance loss compared with the standard and is suitable for hardware implementation in VLSI design.
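As a rough illustration of what "multiplications by powers of two implemented with shifters and adders" means, the sketch below computes a 4-point HEVC forward core transform without any multiply operator. It is not the VD-PBMM decomposition from the paper, whose details are not given in the abstract. It is a generic shift-and-add expansion of the standard 4-point coefficients 64, 83, and 36, and the function names are hypothetical. The scaling and rounding stages of the real transform are omitted.

```python
# Illustrative only: a generic shift-and-add 4-point HEVC core transform.
# This is NOT the paper's VD-PBMM architecture; function names are hypothetical.

def mul64(x):
    # 64 = 2^6
    return x << 6

def mul83(x):
    # 83 = 64 + 16 + 2 + 1
    return (x << 6) + (x << 4) + (x << 1) + x

def mul36(x):
    # 36 = 32 + 4
    return (x << 5) + (x << 2)

def dct4_shift_add(x):
    """4-point transform using a partial butterfly plus shifts and adds."""
    # Butterfly stage: sums feed the even outputs, differences the odd outputs.
    e0, e1 = x[0] + x[3], x[1] + x[2]
    o0, o1 = x[0] - x[3], x[1] - x[2]
    return [
        mul64(e0 + e1),
        mul83(o0) + mul36(o1),
        mul64(e0 - e1),
        mul36(o0) - mul83(o1),
    ]

def dct4_reference(x):
    """Direct matrix-vector product with the 4-point HEVC core matrix."""
    m = [
        [64, 64, 64, 64],
        [83, 36, -36, -83],
        [64, -64, -64, 64],
        [36, -83, 83, -36],
    ]
    return [sum(c * v for c, v in zip(row, x)) for row in m]

if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(dct4_shift_add(sample))
    print(dct4_reference(sample))
```

Both functions return the same values for any integer input. In hardware, each left shift is wiring only, so the cost of a constant multiplication reduces to the adders that combine the shifted terms.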

Acknowledgments

This work was supported in part by the NSFC (Nos. 61227004, 61100155, 61401333, and 61301288), the Fundamental Research Funds of the Central Universities of China (Nos. K5051302096, K5051399020, K5051202050, and JB140207), and the Research Fund for the Doctoral Program of Higher Education (No. 20130203130001).

Author information

Authors and Affiliations

Key Laboratory of Intelligent Perception and Image Understanding (Chinese Ministry of Education), School of Electronic Engineering, Xidian University, Xi’an, China
Chunxiao Fan, Fu Li, Guangming Shi, Yi Niu, Fei Qi, Xuemei Xie & Dandan Jiao

Corresponding author

Correspondence to Fu Li.

About this article

Cite this article

Fan, C., Li, F., Shi, G. et al. A hierarchical multiplier-free architecture for HEVC transform. Multimed Tools Appl 76, 997–1015 (2017). https://doi.org/10.1007/s11042-015-3114-3

Received: 20 March 2015
Revised: 18 November 2015
Accepted: 23 November 2015
Published: 26 November 2015
Issue Date: January 2017
DOI: https://doi.org/10.1007/s11042-015-3114-3
