In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder

  • Original Research Paper
  • Journal of Real-Time Image Processing

Abstract

In this paper, a novel High Efficiency Video Coding (HEVC)-compliant perceptual rate-distortion optimization (RDO) scheme is proposed. It is based on a motion attention model and a visual distortion sensitivity model, both of which make full use of the in-loop coding information of HEVC. The motion attention model is built from the motion vectors (MVs) estimated during the inter-prediction process; the MV field is refined by maximum a posteriori (MAP) estimation to remove MV outliers and improve the model’s efficiency. The visual distortion sensitivity is modeled using the spatiotemporal energy of the AC coefficients obtained from the HEVC transform process. The two models are then incorporated into the RDO process, so that the Lagrange multiplier and the quantization parameter are adjusted adaptively in an analytical way. Since both models are calculated within the HEVC coding loop, the complexity increase is limited. The experimental results indicate that the proposed perceptual RDO scheme achieves significantly better rate-VQM performance than the conventional RDO scheme. Specifically, in terms of the Bjontegaard Delta metric, the bit-rate reduction relative to the practical HEVC encoder x265 reaches a maximum of 24.45% and an average of 13.68%.
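
To make the coupling between the Lagrange multiplier and the quantization parameter concrete, the following is a minimal sketch and not the scheme proposed in the paper. It assumes the exponential relation lambda = alpha * 2^((QP - 12) / 3) used in the HM reference software, under which scaling lambda by a factor w corresponds to a QP offset of 3 * log2(w). The perceptual weight `weight`, the constant `alpha = 0.57`, and both function names are hypothetical; how the motion attention and distortion sensitivity models would be combined into such a weight is not specified in this abstract.

```python
import math


def hm_style_lambda(qp, alpha=0.57):
    """Lagrange multiplier for a given QP under the assumed HM-style relation.

    alpha is an illustrative constant (assumption), not a value from the paper.
    """
    return alpha * 2.0 ** ((qp - 12) / 3.0)


def adjust_lambda_and_qp(base_qp, weight, qp_min=0, qp_max=51):
    """Scale lambda by a perceptual weight and derive the matching QP.

    weight > 1 makes the block cheaper in bits (coarser quantization),
    weight < 1 protects it (finer quantization). The weight is a
    hypothetical input standing in for a perceptual model's output.
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    new_lambda = weight * hm_style_lambda(base_qp)
    # Scaling lambda by `weight` is equivalent to shifting QP by 3 * log2(weight).
    delta_qp = 3.0 * math.log2(weight)
    new_qp = min(max(round(base_qp + delta_qp), qp_min), qp_max)
    return new_lambda, new_qp


if __name__ == "__main__":
    for w in (0.5, 1.0, 2.0):
        lam, qp = adjust_lambda_and_qp(32, w)
        print(f"weight={w}: lambda={lam:.2f}, QP={qp}")
```

Under these assumptions, doubling the multiplier raises QP by 3 and halving it lowers QP by 3, so a single weight per block is enough to steer both quantities consistently.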

Notes

  1. The x265 website. [Online]. Available: https://bitbucket.org/multicoreware/x265/wiki/Home.

  2. “JCT-VC Subversion Repository for the HEVC Test Model Version HM16.0,” [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.0.

Author information

Corresponding author

Correspondence to Qiang Hu.

About this article

Cite this article

Hu, Q., Zhou, J., Zhang, X. et al. In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder. J Real-Time Image Proc 17, 293–311 (2020). https://doi.org/10.1007/s11554-018-0772-1

  • DOI: https://doi.org/10.1007/s11554-018-0772-1
