In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder

Hu, Qiang; Zhou, Jun; Zhang, Xiaoyun; Gao, Zhiyong; Sun, Ming-Ting

doi:10.1007/s11554-018-0772-1

Original Research Paper
Published: 05 April 2018

Journal of Real-Time Image Processing, Volume 17, pages 293–311 (2020)

Qiang Hu¹, Jun Zhou¹, Xiaoyun Zhang¹, Zhiyong Gao¹ & Ming-Ting Sun²

Abstract

This paper proposes a novel High Efficiency Video Coding (HEVC)-compliant perceptual rate-distortion optimization (RDO) scheme based on a motion attention model and a visual distortion sensitivity model, both of which fully utilize in-loop coding information of HEVC. The motion attention model is built from the motion vectors (MVs) estimated during the inter-prediction process. The MV field is refined by maximum a posteriori (MAP) estimation to remove MV outliers and improve the model's efficiency. The visual distortion sensitivity is modeled using the spatiotemporal energy of AC coefficients, which are obtained from the HEVC transform process. The two models are then incorporated into the RDO process, so that the Lagrange multiplier and quantization parameter are adjusted adaptively in an analytical way. Because both models are calculated within the HEVC coding loop, the complexity increase is limited. Experimental results indicate that the proposed perceptual RDO scheme achieves significantly better rate-VQM performance than the conventional RDO scheme. Specifically, compared with the practical HEVC encoder x265, the BD-rate reduction (Bjontegaard Delta metric) reaches a maximum of 24.45% and averages 13.68%.
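The coupling between the Lagrange multiplier and the quantization parameter mentioned above can be illustrated with a minimal sketch. It assumes the relation λ ∝ 2^((QP − 12)/3) commonly used in HEVC encoders, under which scaling λ by a factor w corresponds to a QP change of 3·log2(w). The per-block `weight` is a hypothetical placeholder for whatever factor the motion attention and distortion sensitivity models produce; the abstract does not give the authors' actual formula, and the function names here are illustrative only.

```python
import math


def qp_offset_for_lambda_scale(scale: float) -> float:
    """QP change that matches multiplying lambda by `scale`.

    Assumes lambda is proportional to 2 ** ((QP - 12) / 3), so a factor
    of 2 in lambda corresponds to +3 in QP.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return 3.0 * math.log2(scale)


def adjust_block(base_lambda: float, base_qp: int, weight: float,
                 qp_min: int = 0, qp_max: int = 51) -> tuple[float, int]:
    """Scale lambda by a perceptual weight and move QP consistently.

    `weight` is a hypothetical per-block factor: values below 1 spend
    more bits on the block, values above 1 spend fewer.
    """
    new_lambda = base_lambda * weight
    new_qp = round(base_qp + qp_offset_for_lambda_scale(weight))
    # Keep QP inside the 8-bit HEVC range.
    new_qp = max(qp_min, min(qp_max, new_qp))
    return new_lambda, new_qp


if __name__ == "__main__":
    # A block judged perceptually important (weight 0.5) gets a lower QP.
    print(adjust_block(base_lambda=10.0, base_qp=32, weight=0.5))
```

Deriving the QP offset from the λ scale, rather than choosing the two independently, keeps mode decision and quantization consistent with each other for a block.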

Notes

The x265 website. [Online]. Available: https://bitbucket.org/multicoreware/x265/wiki/Home.
“JCT-VC Subversion Repository for the HEVC Test Model Version HM16.0,” [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.0.

Author information

Authors and Affiliations

Institute of Image Communication and Network Engineering, Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
Qiang Hu, Jun Zhou, Xiaoyun Zhang & Zhiyong Gao
Department of Electrical Engineering, University of Washington, Seattle, WA, 98105, USA
Ming-Ting Sun

Corresponding author

Correspondence to Qiang Hu.

About this article

Cite this article

Hu, Q., Zhou, J., Zhang, X. et al. In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder. J Real-Time Image Proc 17, 293–311 (2020). https://doi.org/10.1007/s11554-018-0772-1

Received: 28 November 2017
Accepted: 30 March 2018
Published: 05 April 2018
Issue Date: April 2020
DOI: https://doi.org/10.1007/s11554-018-0772-1
