Abstract
Despite its high decorrelation performance, the large block size of transform coding in High Efficiency Video Coding (HEVC) adds undesirable complexity to hardware design. The heaviest burden in implementing the HEVC transform is the large number of multiplications. In this paper, we propose a novel hierarchical multiplier-free architecture for the HEVC transform: a multiplier-free version of the partial butterfly combined with matrix multiplications (PBMM) architecture, based on vector decomposition (VD-PBMM). In the proposed architecture, the complicated matrix multiplication in PBMM is carried out in several simple stages to simplify its VLSI realization. Each stage involves only additions and multiplications by powers of two, which can be implemented with adders and shifters. In addition, the new architecture balances the distribution of adders to improve the system frequency. The proposed architecture has been evaluated in TSMC 0.13 µm CMOS technology. The resulting design runs at 400 MHz with 92 K logic gates, which is about half that of the PBMM method when the latency is 8. The proposed architecture performs the transform without any performance loss relative to the standard and is well suited to hardware implementation in VLSI design.
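To illustrate the general idea of replacing constant multiplications with shifts and additions, the following minimal sketch computes a 1-D 4-point forward transform with a partial butterfly. It is not the VD-PBMM architecture proposed in the paper, whose stage structure is not given in the abstract. The coefficient matrix (64, 83, 36) is taken here as the 4-point HEVC core transform and is an assumption of this example, as are all function names.

```python
# Illustrative sketch only: shift-and-add constant multiplication in a
# 4-point partial butterfly. The coefficients 64, 83, 36 are assumed to be
# those of the 4-point HEVC core transform; helper names are hypothetical.

def mul64(x):
    # 64 = 2^6
    return x << 6

def mul83(x):
    # 83 = 64 + 16 + 2 + 1
    return (x << 6) + (x << 4) + (x << 1) + x

def mul36(x):
    # 36 = 32 + 4
    return (x << 5) + (x << 2)

def transform4_shift_add(x):
    """1-D 4-point forward transform using only adds, subtracts, and shifts."""
    x0, x1, x2, x3 = x
    # Butterfly stage: even (sum) and odd (difference) parts.
    e0, e1 = x0 + x3, x1 + x2
    o0, o1 = x0 - x3, x1 - x2
    return [
        mul64(e0 + e1),
        mul83(o0) + mul36(o1),
        mul64(e0 - e1),
        mul36(o0) - mul83(o1),
    ]

def transform4_reference(x):
    """Same transform as a plain integer matrix-vector product, for checking."""
    m = [
        [64, 64, 64, 64],
        [83, 36, -36, -83],
        [64, -64, -64, 64],
        [36, -83, 83, -36],
    ]
    return [sum(c * v for c, v in zip(row, x)) for row in m]

if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(transform4_shift_add(sample))
    print(transform4_reference(sample))
```

Both functions produce identical integer results, so the shift-and-add form loses no precision relative to the matrix form; in hardware, each shift is a fixed wiring offset and only the adders cost logic.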
Acknowledgments
This work was supported in part by the NSFC (Nos. 61227004, 61100155, 61401333, and 61301288), the Fundamental Research Funds of the Central Universities of China (Nos. K5051302096, K5051399020, K5051202050, and JB140207), and the Research Fund for the Doctoral Program of Higher Education (No. 20130203130001).
Cite this article
Fan, C., Li, F., Shi, G. et al. A hierarchical multiplier-free architecture for HEVC transform. Multimed Tools Appl 76, 997–1015 (2017). https://doi.org/10.1007/s11042-015-3114-3
DOI: https://doi.org/10.1007/s11042-015-3114-3