Abstract
This paper proposes a High Efficiency Video Coding (HEVC)-compliant perceptual rate-distortion optimization (RDO) scheme based on a motion attention model and a visual distortion sensitivity model, both of which make full use of in-loop HEVC coding information. The motion attention model is built from the motion vectors (MVs) estimated during inter prediction. The MV field is refined by maximum a posteriori (MAP) estimation to remove MV outliers and improve the model’s efficiency. The visual distortion sensitivity is modeled using the spatiotemporal energy of the AC coefficients obtained from the HEVC transform process. The two models are then incorporated into the RDO process, so that the Lagrange multiplier and the quantization parameter are adjusted adaptively in an analytical way. Because both models are computed within the HEVC coding loop, the increase in complexity is limited. Experimental results indicate that the proposed perceptual RDO scheme achieves significantly better rate-VQM performance than the conventional RDO scheme. Specifically, compared with the practical HEVC encoder x265, the Bjontegaard Delta rate (BD-rate) is reduced by up to 24.45% and by 13.68% on average.
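The abstract does not give the analytical formulas that link the two perceptual models to the Lagrange multiplier and the quantization parameter (QP). As a rough illustration only, the sketch below assumes the widely used HM-style relation λ = α · 2^((QP − 12)/3), under which scaling λ by a factor w corresponds to a QP offset of 3 · log2(w). The function names, the constant `alpha = 0.57`, the single combined `weight`, and the clamping range are all hypothetical choices for this example, not the authors' method.

```python
import math


def lambda_from_qp(qp: float, alpha: float = 0.57) -> float:
    """HM-style Lagrange multiplier: lambda = alpha * 2^((QP - 12) / 3).

    alpha = 0.57 is an illustrative constant (assumption), not a value
    taken from the paper.
    """
    return alpha * 2.0 ** ((qp - 12.0) / 3.0)


def perceptual_adjust(base_qp: int, weight: float, max_delta: int = 6):
    """Derive a block-level QP and lambda from a perceptual weight.

    `weight` is a hypothetical scalar standing in for the combined output
    of the motion attention and distortion sensitivity models:
      weight > 1 -> block tolerates more distortion (larger lambda and QP)
      weight < 1 -> block is more sensitive (smaller lambda and QP)
    """
    if weight <= 0:
        raise ValueError("weight must be positive")
    # Scaling lambda by `weight` is equivalent to a QP offset of 3 * log2(weight).
    delta_qp = round(3.0 * math.log2(weight))
    # Limit the offset, then keep the QP inside the HEVC range [0, 51].
    delta_qp = max(-max_delta, min(max_delta, delta_qp))
    qp = max(0, min(51, base_qp + delta_qp))
    # Recompute lambda from the final QP so that the two stay consistent.
    return lambda_from_qp(qp), qp


if __name__ == "__main__":
    for w in (0.5, 1.0, 2.0):
        lam, qp = perceptual_adjust(32, w)
        print(f"weight={w:.1f} -> QP={qp}, lambda={lam:.2f}")
```

Recomputing λ from the rounded and clamped QP keeps the mode decision and the quantizer operating at the same point on the rate-distortion curve, which is one common way to keep the two adjustments coupled.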









Notes
The x265 website. [Online]. Available: https://bitbucket.org/multicoreware/x265/wiki/Home.
“JCT-VC Subversion Repository for the HEVC Test Model Version HM16.0,” [Online]. Available: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.0.
Cite this article
Hu, Q., Zhou, J., Zhang, X. et al. In-loop perceptual model-based rate-distortion optimization for HEVC real-time encoder. J Real-Time Image Proc 17, 293–311 (2020). https://doi.org/10.1007/s11554-018-0772-1
DOI: https://doi.org/10.1007/s11554-018-0772-1