Perceptual feature guided rate distortion optimization for high efficiency video coding

Multidimensional Systems and Signal Processing

Abstract

With advances in the understanding of the perceptual properties of the human visual system, perceptual video coding, which aims to incorporate human perceptual mechanisms into video coding to maximize perceptual coding efficiency, has become an essential research topic. Because the newest video coding standard, high efficiency video coding (HEVC), does not fully consider the perceptual characteristics of the input video, this paper presents a perceptual feature guided rate distortion optimization (RDO) method to improve its perceptual coding performance. In the proposed method, for each coding tree unit, a spatial perceptual feature (the gradient magnitude ratio) and a temporal perceptual feature (the gradient magnitude similarity deviation ratio) are extracted by considering the spatial and temporal perceptual correlations. These perceptual features are then used to guide the RDO process by perceptually adjusting the corresponding Lagrangian multiplier. Extensive simulation results obtained by incorporating the proposed method into HEVC demonstrate that it significantly improves the perceptual coding performance and yields better visual quality of the reconstructed video than the original RDO in HEVC.
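The idea of adjusting the Lagrangian multiplier per coding tree unit (CTU) can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract gives no formulas, so the central-difference gradient, the definition of the ratio as a CTU's mean gradient magnitude over the frame mean, the clamping range, and the direction of the scaling are all assumptions made for illustration. The function names (`gradient_magnitude`, `ctu_gradient_ratios`, `adjusted_lambda`, `rd_cost`) are hypothetical. Only the cost form J = D + λR is the standard Lagrangian RDO formulation. The temporal feature is not modelled here.

```python
from math import hypot


def gradient_magnitude(frame):
    """Per-pixel gradient magnitude using central differences.

    Border pixels reuse the nearest in-frame neighbour (assumed choice).
    """
    h, w = len(frame), len(frame[0])
    gm = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = frame[y][min(x + 1, w - 1)] - frame[y][max(x - 1, 0)]
            gy = frame[min(y + 1, h - 1)][x] - frame[max(y - 1, 0)][x]
            gm[y][x] = hypot(gx, gy)
    return gm


def ctu_gradient_ratios(frame, ctu_size):
    """Map each CTU origin (y0, x0) to its mean gradient magnitude
    divided by the frame-wide mean (hypothetical ratio definition)."""
    gm = gradient_magnitude(frame)
    h, w = len(gm), len(gm[0])
    frame_mean = sum(map(sum, gm)) / (h * w)
    ratios = {}
    for y0 in range(0, h, ctu_size):
        for x0 in range(0, w, ctu_size):
            block = [v for row in gm[y0:y0 + ctu_size]
                     for v in row[x0:x0 + ctu_size]]
            ctu_mean = sum(block) / len(block)
            # A completely flat frame has no gradient; treat every CTU as neutral.
            ratios[(y0, x0)] = ctu_mean / frame_mean if frame_mean > 0 else 1.0
    return ratios


def adjusted_lambda(base_lambda, ratio, lo=0.5, hi=2.0):
    """Scale the multiplier by the ratio, clamped to [lo, hi].

    Assumption: a larger multiplier (fewer bits) for high-gradient CTUs,
    a smaller one for smooth CTUs. The clamp bounds are arbitrary.
    """
    return base_lambda * min(max(ratio, lo), hi)


def rd_cost(distortion, rate, lam):
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate


# Toy 8x8 luma frame: flat left half, steep horizontal ramp on the right.
demo_frame = [[100 if x < 4 else 100 + 40 * (x - 3) for x in range(8)]
              for _ in range(8)]
demo_ratios = ctu_gradient_ratios(demo_frame, ctu_size=4)

if __name__ == "__main__":
    for origin, ratio in sorted(demo_ratios.items()):
        print(origin, ratio, adjusted_lambda(10.0, ratio))
```

In this toy frame the flat CTUs receive a multiplier below the base value and the ramp CTUs one above it, so a mode decision that minimizes `rd_cost` would weight rate more heavily where the gradient is strong. A real encoder would apply such a scaling inside its own mode-decision loop with CTU sizes up to 64×64.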

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61401167 and 61372107, in part by the Natural Science Foundation of Fujian Province under Grant 2016J01308, in part by the Opening Project of State Key Laboratory of Digital Publishing Technology under Grant FZDP2015-B-001, in part by the Zhejiang Open Foundation of the Most Important Subjects, in part by the High-Level Talent Project Foundation of Huaqiao University under Grants 14BS201 and 14BS204, and in part by the Graduate Student Scientific Research Innovation Ability Cultivation Plan Projects of Huaqiao University under Grant 1400201031.

Author information

Corresponding author

Correspondence to Huanqiang Zeng.

About this article

Cite this article

Yang, A., Zeng, H., Chen, J. et al. Perceptual feature guided rate distortion optimization for high efficiency video coding. Multidim Syst Sign Process 28, 1249–1266 (2017). https://doi.org/10.1007/s11045-016-0395-2

  • DOI: https://doi.org/10.1007/s11045-016-0395-2
