Perceptual rate-distortion optimization for H.264/AVC video coding from both signal and vision perspectives

Multimedia Tools and Applications

Abstract

Recent research has shown that Structural SIMilarity (SSIM)-based Rate-Distortion Optimization (RDO) preserves more structural information than traditional sum-of-squared-errors (SSE)-based RDO in video coding. Correspondingly, the perceptual quality of video streams encoded with SSIM-based RDO improves to a certain degree. However, for a macroblock (MB) with a significant luminance change (due to a flashlight, changing sunlight, etc.) but little structural change relative to the co-located MB in the reference frame, SSIM-based RDO may select an improper encoding mode (SKIP), which leads to an uncomfortable visual experience. In this paper, by involving the pixels surrounding the current MB in the measurement of spatial texture quality degradation, an RDO that jointly considers SSIM-based and SSE-based distortion is proposed to improve H.264/AVC perceptual coding performance. The Lagrange multiplier in the proposed RDO is first derived at the frame level. Then, to make it more adaptive to the specific video content, the Lagrange multiplier for each individual MB is refined based on visual information theory. Experimental results show that the proposed perceptual RDO selects the encoding mode that preserves as much structural information as possible with as little SSE-based distortion as possible, and that the perceptual quality of the encoded video is improved.
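The SKIP problem described in the abstract can be illustrated with a small numeric sketch. It is not the formulation derived in the paper: the cost form `(1 - SSIM) + sse_weight * SSE + lam * rate`, the weights, the rates, the mode names, and the toy 16-pixel blocks are all assumptions chosen for illustration, and the sketch does not model the surrounding-pixel measurement or the MB-level Lagrange multiplier refinement. SSIM is computed over a single window using the standard constants of Wang et al. (2004).

```python
# Illustrative sketch only: the cost form, weights, rates, and mode names
# are assumptions, not the formulation derived in the paper.

C1 = (0.01 * 255) ** 2  # SSIM stabilizing constants (Wang et al., 2004)
C2 = (0.03 * 255) ** 2


def ssim(x, y):
    """Single-window SSIM over two equal-length lists of pixel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + C1) * (2 * cov + C2)) / (
        (mx * mx + my * my + C1) * (vx + vy + C2)
    )


def sse(x, y):
    """Sum of squared errors between two equal-length pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rd_cost(orig, recon, rate_bits, lam, sse_weight):
    """Hypothetical joint cost: SSIM distortion plus weighted SSE plus rate."""
    distortion = (1.0 - ssim(orig, recon)) + sse_weight * sse(orig, recon)
    return distortion + lam * rate_bits


def best_mode(orig, candidates, lam, sse_weight):
    """Return the candidate mode name with the lowest cost."""
    return min(
        candidates,
        key=lambda m: rd_cost(orig, candidates[m][0], candidates[m][1],
                              lam, sse_weight),
    )


# Current block: textured, mean luminance 120.
orig = [100, 140] * 8
candidates = {
    # SKIP copies the reference block: same structure, 20 levels darker.
    "SKIP": ([80, 120] * 8, 1),
    # A coded mode: close reconstruction, but costs more bits.
    "INTER_16x16": ([102, 138] * 8, 40),
}

# sse_weight = 0 reduces the cost to SSIM-only RDO.
print(best_mode(orig, candidates, lam=0.001, sse_weight=0.0))
print(best_mode(orig, candidates, lam=0.001, sse_weight=1e-4))
```

With these toy numbers, the SSIM-only cost favors SKIP because a uniform luminance shift barely lowers SSIM, while the SSE term in the joint cost penalizes the shift heavily and moves the decision to the coded mode.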

Acknowledgment

This work was supported in part by NSFC under Grant Nos. 61102077, 61401109, and 61472388; the National Key Technology R&D Program (2012BAH01B03); the Zhejiang Provincial Natural Science Foundation of China under Contract LY13F010012; and the Public Welfare Projects of Zhejiang Province under Contract 2014C31072.

Author information

Corresponding author

Correspondence to Yanwei Liu.

About this article

Cite this article

Zhao, P., Liu, Y., Liu, J. et al. Perceptual rate-distortion optimization for H.264/AVC video coding from both signal and vision perspectives. Multimed Tools Appl 75, 2781–2800 (2016). https://doi.org/10.1007/s11042-015-2533-5

  • DOI: https://doi.org/10.1007/s11042-015-2533-5
