Abstract
This paper investigates the temporal pumping artifact (TPA) induced by digital video coding with the H.264/AVC and H.265/HEVC standards and proposes a visual-masking-based method, referred to as VM-TPA-PRE, to estimate regions with perceptible TPA in head-and-shoulder video sequences, which are common in video messaging, video conferencing and video telephony applications. In digitally coded head-and-shoulder video sequences, the TPA manifests itself as a stumbling effect caused by severe frame-to-frame quality fluctuations among adjacent pictures, which are most likely to be perceived in regions to which the human visual system (HVS) is sensitive. From the perspective of object-based or region-of-interest-based video coding theory, accurately estimating the regions where the TPA is perceivable to the HVS is the key to effective assessment and processing of the TPA and to improving the visual quality of videos impaired by it. Experimental results show that the estimation by the VM-TPA-PRE is accurate and in line with human perception.
Acknowledgments
The authors are indebted to the anonymous reviewers for their thorough reviews, constructive comments and valuable suggestions, which helped to improve the quality and presentation of the manuscript. Sincere thanks also go to our cooperative institutions, (a) the State Key Laboratory of ISN, Xidian University, and (b) the Visual Communications Engineering Research Laboratory, School of Electrical and Computer Engineering, Royal Melbourne Institute of Technology University, for providing the HD test video sequences used in the experiments. This work was supported by the National Natural Science Foundation Research Program of China (No. 61371089).
Cite this article
Gong, Y., Wan, S., Yang, K. et al. A Visual-Masking-Based Estimation Algorithm for Temporal Pumping Artifact Region Prediction. Circuits Syst Signal Process 36, 1264–1287 (2017). https://doi.org/10.1007/s00034-016-0357-9
DOI: https://doi.org/10.1007/s00034-016-0357-9