
A Visual-Masking-Based Estimation Algorithm for Temporal Pumping Artifact Region Prediction

Published in Circuits, Systems, and Signal Processing

Abstract

This paper investigates the temporal pumping artifact (TPA) induced by digital video coding with the H.264/AVC and H.265/HEVC standards and proposes a visual-masking-based method, referred to as VM-TPA-PRE, for estimating regions with perceptible TPA in head-and-shoulder video sequences, which are common in video messaging, video conferencing, and video telephony applications. In coded head-and-shoulder sequences, the TPA manifests itself as a stumbling effect caused by severe frame-to-frame quality fluctuations among adjacent pictures, and it is most likely to be perceived in regions to which the human visual system (HVS) is sensitive. In view of object-based and region-of-interest-based video coding theory, accurately estimating the regions in which the TPA is perceivable to the HVS is the key to effective assessment and processing of the TPA and to improving the visual quality of videos impaired by it. Experimental results show that the estimation produced by VM-TPA-PRE is accurate and consistent with human perception.
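To make the underlying idea concrete, the sketch below is a minimal NumPy illustration of masking-adjusted fluctuation detection, not the paper's VM-TPA-PRE algorithm: the block size, the variance-based texture-masking proxy, the thresholds (fluct_thresh, mask_scale), and the helper names are all hypothetical assumptions. It flags blocks whose frame-to-frame quality fluctuation remains visible after a simple spatial-masking adjustment, given original and decoded luma frames as 2-D arrays.

```python
import numpy as np

def block_mse(orig, dec, block=16):
    """Per-block MSE between an original and a decoded luma frame."""
    h, w = orig.shape
    h, w = h - h % block, w - w % block
    diff = orig[:h, :w].astype(np.float64) - dec[:h, :w].astype(np.float64)
    return (diff ** 2).reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def texture_masking(frame, block=16):
    """Crude spatial-masking proxy: local luma variance per block.
    Highly textured blocks tend to mask coding distortion more strongly."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block
    blocks = frame[:h, :w].astype(np.float64).reshape(h // block, block, w // block, block)
    return blocks.var(axis=(1, 3))

def tpa_region_map(orig_frames, dec_frames, block=16, fluct_thresh=20.0, mask_scale=0.02):
    """Flag blocks whose frame-to-frame quality fluctuation exceeds a
    masking-adjusted threshold; thresholds here are illustrative only."""
    maps, prev_mse = [], None
    for orig, dec in zip(orig_frames, dec_frames):
        mse = block_mse(orig, dec, block)
        if prev_mse is not None:
            fluct = np.abs(mse - prev_mse)  # temporal quality fluctuation per block
            visibility = fluct / (1.0 + mask_scale * texture_masking(orig, block))
            maps.append(visibility > fluct_thresh)  # True = potentially perceptible TPA
        prev_mse = mse
    return maps

# Synthetic example: smooth gradient frames whose reconstructions alternate
# between light and heavy "coding noise", mimicking quality fluctuation.
rng = np.random.default_rng(0)
originals = [(np.tile(np.arange(64), (64, 1)) + i).astype(np.uint8) for i in range(4)]
decoded = [np.clip(f + rng.normal(0, 2 + 10 * (i % 2), f.shape), 0, 255)
           for i, f in enumerate(originals)]
print([m.mean() for m in tpa_region_map(originals, decoded)])  # fraction of flagged blocks per frame pair
```

In this synthetic example most blocks are flagged because the frames are smooth and the fluctuation is large; on real head-and-shoulder content, the masking term would be expected to suppress flags in textured background areas while retaining them in smooth, HVS-sensitive regions such as faces.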




Acknowledgments

The authors are indebted to the anonymous reviewers for their thorough reviews, constructive comments, and valuable suggestions, which helped to improve the quality and presentation of the manuscript. Sincere thanks also go to our cooperating institutions, (a) the State Key Laboratory of ISN, Xidian University, and (b) the Visual Communications Engineering Research Laboratory, School of Electrical and Computer Engineering, Royal Melbourne Institute of Technology University, for providing the HD test video sequences used in the experiments. This work was supported by the National Natural Science Foundation of China (No. 61371089).


Corresponding author

Correspondence to Yanchao Gong.


About this article


Cite this article

Gong, Y., Wan, S., Yang, K. et al. A Visual-Masking-Based Estimation Algorithm for Temporal Pumping Artifact Region Prediction. Circuits Syst Signal Process 36, 1264–1287 (2017). https://doi.org/10.1007/s00034-016-0357-9
