Abstract
Fifth-generation (5G) mobile networks pave the way for ultra-high-definition video communications, and the newest versatile video coding standard, VVC/H.266, which supports 8K video coding, is well suited to media streaming applications over 5G networks, such as remote desktop, online streaming, cloud gaming, and other interactive media services. To provide a better media-consuming experience, a platform must control quality and response time in real time. VVC/H.266 adopts a quadtree plus multitype tree (QTMT) coding structure that requires exhaustive search operations, such that its time complexity is 18 times that of the previous HEVC/H.265 standard. To make practical applications feasible, we propose to quickly determine whether a coding unit (CU) resides in a static region; based on this decision, the VVC coding controller decides whether to inherit the co-located QTMT coding mode of a previously coded frame, thereby reducing encoding time. A perceptual similarity measure, MS-SSIM, is used to determine whether a CU is static. In addition, a learned optical flow motion estimation (OFME) model is developed to measure motion activity and screen out false-positive results, so that the Bjøntegaard delta bit rate (BDBR) increase is kept small. By quickly locating static CUs and precisely screening out false-positive ones, the VVC encoding time can be largely reduced while maintaining good quality. Experiments show that the proposed method saves 42.34% of encoding time with a 1.49% BDBR increase, compared with the default VTM 11.0 intra-coding. On average, static-region blocks account for 61.32% of the blocks in the test video sequences.
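The two-stage decision described in the abstract (a similarity test to find static CUs, then a motion test to reject false positives) can be illustrated with a minimal Python sketch. It is not the authors' implementation. A single-window SSIM stands in for MS-SSIM, and a mean absolute difference stands in for the learned OFME model. The function names and both thresholds (`sim_thresh`, `motion_thresh`) are hypothetical values chosen only for illustration.

```python
import numpy as np


def ssim_global(a, b, data_range=255.0):
    """Single-window SSIM over a whole block.

    Assumption: a simplified stand-in for MS-SSIM, which evaluates
    similarity at several scales rather than one.
    """
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den


def motion_activity(a, b):
    """Mean absolute difference between two blocks.

    Assumption: a placeholder for the motion magnitude that a learned
    optical-flow model would report.
    """
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.abs(diff).mean())


def should_inherit(cur_cu, prev_cu, sim_thresh=0.98, motion_thresh=1.0):
    """Return True if the co-located coding mode may be reused.

    Stage 1: the CU must look static (high similarity).
    Stage 2: the motion check screens out false positives.
    Both thresholds are hypothetical.
    """
    if ssim_global(cur_cu, prev_cu) < sim_thresh:
        return False
    return motion_activity(cur_cu, prev_cu) <= motion_thresh


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev = rng.integers(0, 200, size=(64, 64), dtype=np.uint8)
    unchanged = prev.copy()   # static CU
    offset = prev + 2         # high similarity, but fails the second check
    print(should_inherit(unchanged, prev))
    print(should_inherit(offset, prev))
```

In an encoder, a `True` result would let the controller skip the exhaustive QTMT search for that CU and reuse the partition of the co-located CU, while a `False` result would fall back to the normal search.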
Data Availability Statement
All data generated or analyzed during this study are included in this published article.
Acknowledgements
This work was partially supported by the Taiwan Ministry of Science and Technology under grant No. MOST 109-2221-E-011-117.
Ethics declarations
Consent for Publication
We declare that this work is original and is not under consideration for publication elsewhere.
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Chen, JJ., Su, JA. Fast H.266/VVC intra-coding by mode inheritance. Multimed Tools Appl 82, 36041–36065 (2023). https://doi.org/10.1007/s11042-023-14849-5
DOI: https://doi.org/10.1007/s11042-023-14849-5