
Fast coding unit size decision based on deep reinforcement learning for versatile video coding

Multimedia Tools and Applications

Abstract

Video coding has long sought a more practical alternative to the greedy method. The quad-tree with nested multi-type tree (QTMT) structure, which combines quad-tree (QT) and multi-type tree (MTT) partitioning, results in high coding complexity in Versatile Video Coding (VVC). In addition, rapid progress in deep learning (DL) is attracting increasing attention in the video coding community. Therefore, this paper proposes a fast coding unit (CU) splitting decision method based on deep reinforcement learning (DRL) for VVC to decrease the coding complexity. Specifically, the splitting of a 32 × 32 CU is modeled as a Markov decision process (MDP): the CU splitting situation at a given node is the state, the splitting mode decisions are the actions, the reduction or increase in rate-distortion (RD) cost is the immediate reward or punishment, and the encoder is the agent that makes coding decisions successively. The simulation results demonstrate that the proposed approach achieves a coding time reduction (CTR) of about 54.38% while maintaining coding performance, realizing a trade-off between complexity reduction and coding efficiency.
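
The MDP formulation in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the state labels, the step size and discount factor, the RD cost values, and the use of a tabular Q-learning update as a stand-in for the deep network are all hypothetical. The six split modes are the standard QTMT choices in VVC, and the CTR helper assumes the commonly used definition (reference time minus proposed time, divided by reference time).

```python
from collections import defaultdict
from enum import Enum


class SplitMode(Enum):
    """The six QTMT split choices in VVC, used here as the MDP actions."""
    NO_SPLIT = 0
    QT = 1      # quad-tree split
    BT_H = 2    # binary-tree split, horizontal
    BT_V = 3    # binary-tree split, vertical
    TT_H = 4    # ternary-tree split, horizontal
    TT_V = 5    # ternary-tree split, vertical


def reward(rd_cost_before, rd_cost_after):
    """Immediate reward: positive when the split lowers the RD cost,
    negative (a punishment) when it raises it."""
    return rd_cost_before - rd_cost_after


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (assumed stand-in for the deep network)."""
    best_next = max(q[(next_state, a)] for a in SplitMode)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def coding_time_reduction(t_reference, t_proposed):
    """CTR in percent, assuming CTR = (T_ref - T_prop) / T_ref * 100."""
    return 100.0 * (t_reference - t_proposed) / t_reference


q_table = defaultdict(float)
# Hypothetical state labels: (width, height, depth) of the current CU node.
s, s_next = (32, 32, 0), (16, 16, 1)
# Hypothetical RD costs before and after applying a quad-tree split.
r = reward(rd_cost_before=1200.0, rd_cost_after=1100.0)
new_q = q_update(q_table, s, SplitMode.QT, r, s_next)
```

In this sketch a split that lowers the RD cost raises the value of that action at the 32 × 32 node, so a trained agent can pick a split mode directly instead of evaluating every candidate partition exhaustively.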



Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Nos. 61771432 and 61302118), the Basic Research Projects of the Education Department of Henan (Nos. 21zx003 and 20A880004), and the Key Research and Development Program of Henan (No. 202102210179).

Author information

Corresponding author

Correspondence to Qiuwen Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhao, J., Wang, Y., Li, M. et al. Fast coding unit size decision based on deep reinforcement learning for versatile video coding. Multimed Tools Appl 81, 16371–16387 (2022). https://doi.org/10.1007/s11042-022-12558-z

  • DOI: https://doi.org/10.1007/s11042-022-12558-z
