Abstract
The encoding time of HEVC is high mainly because of its hierarchical quad-tree structure, recursive coding units, and exhaustive prediction search over up to 35 modes. These tools improve coding efficiency but result in very high computational complexity. Furthermore, selecting the optimal modes among all prediction modes is necessary for the subsequent rate-distortion optimization process. We therefore propose a convolutional neural network-based algorithm that learns region-wise image features and performs classification. The classification results are then used downstream in the encoder to find the optimal coding units in each tree block and, subsequently, to reduce the number of prediction modes. Experimental results show that the proposed learning-based algorithm achieves an encoding time saving of up to 66.89% with a minimal Bjøntegaard delta bit rate (BD-BR) loss of 1.31% compared with state-of-the-art machine learning approaches. Our method also reduces mode selection by 45.83% with respect to the HEVC baseline.
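To make the source of the complexity concrete, the following minimal Python sketch contrasts how many coding units an exhaustive quad-tree search visits in one 64×64 tree block with how many remain when a split/no-split decision is made up front. It is an illustration only, not the paper's implementation: the 64×64 tree block, the 8×8 minimum CU size, and the names `exhaustive_cus`, `predicted_cus`, and `toy_split_rule` are assumptions introduced here, and `toy_split_rule` is a hypothetical stand-in for the CNN classifier.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64    # assumed tree block size in luma samples
MIN_CU_SIZE = 8  # assumed smallest coding unit size

Block = Tuple[int, int, int]  # (x, y, size)


def exhaustive_cus(x: int = 0, y: int = 0, size: int = CTU_SIZE) -> List[Block]:
    """List every CU an exhaustive quad-tree search would evaluate."""
    blocks = [(x, y, size)]
    if size > MIN_CU_SIZE:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                blocks += exhaustive_cus(x + dx, y + dy, half)
    return blocks


def predicted_cus(
    should_split: Callable[[int, int, int], bool],
    x: int = 0,
    y: int = 0,
    size: int = CTU_SIZE,
) -> List[Block]:
    """List the leaf CUs kept when a classifier decides each split up front."""
    if size > MIN_CU_SIZE and should_split(x, y, size):
        half = size // 2
        leaves: List[Block] = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += predicted_cus(should_split, x + dx, y + dy, half)
        return leaves
    return [(x, y, size)]


def toy_split_rule(x: int, y: int, size: int) -> bool:
    # Hypothetical stand-in for the CNN: keep splitting only the
    # top-left region, and stop once blocks reach 16x16.
    return x == 0 and y == 0 and size > 16


if __name__ == "__main__":
    print("CUs visited by exhaustive search:", len(exhaustive_cus()))
    print("Leaf CUs with predicted splits:", len(predicted_cus(toy_split_rule)))
```

Under these assumptions the exhaustive search visits 1 + 4 + 16 + 64 = 85 candidate CUs per tree block, each of which would still need its own prediction mode search, whereas the predicted partition keeps only the leaves chosen by the split rule. The timing and BD-BR figures reported in the abstract come from the authors' experiments and are not reproduced by this sketch.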
Acknowledgements
The author would like to thank all the reviewers for their time and valuable comments.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kuanar, S., Rao, K.R., Bilas, M. et al. Adaptive CU Mode Selection in HEVC Intra Prediction: A Deep Learning Approach. Circuits Syst Signal Process 38, 5081–5102 (2019). https://doi.org/10.1007/s00034-019-01110-4
DOI: https://doi.org/10.1007/s00034-019-01110-4