A Visual-Masking-Based Estimation Algorithm for Temporal Pumping Artifact Region Prediction

Gong, Yanchao; Wan, Shuai; Yang, Kaifang; Wu, Hong Ren; Li, Bo

doi:10.1007/s00034-016-0357-9

Published: 28 June 2016

Volume 36, pages 1264–1287, (2017)

Circuits, Systems, and Signal Processing

Yanchao Gong¹, Shuai Wan¹, Kaifang Yang¹, Hong Ren Wu² & Bo Li¹

Abstract

This paper investigates the temporal pumping artifact (TPA) induced by digital video coding with the H.264/AVC and H.265/HEVC standards and proposes a visual-masking-based method, referred to as VM-TPA-PRE, for estimating regions with perceptible TPA in head-and-shoulder video sequences, which are common in video messaging, video conferencing and video telephony applications. In digitally coded head-and-shoulder video sequences, the TPA manifests itself as a stumbling effect caused by severe frame-to-frame quality fluctuations among adjacent pictures, which are most likely to be perceived in regions to which the human visual system (HVS) is sensitive. In light of object-based or region-of-interest-based video coding theory, accurately estimating the regions where the TPA is perceivable to the HVS is key to assessing and processing the TPA effectively and to improving the visual quality of videos impaired by it. Experimental results show that the estimation by VM-TPA-PRE is accurate and in line with human perception.
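
The frame-to-frame quality fluctuation that the abstract describes can be illustrated with a minimal sketch. It is not the VM-TPA-PRE method: it assumes PSNR as a stand-in quality measure, uses a hypothetical threshold in dB, and ignores visual masking and region estimation entirely. The function names and the sample values are illustrative only.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """PSNR in dB between two equally sized frames given as flat pixel sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def quality_fluctuation(psnr_per_frame):
    """Absolute PSNR change between each pair of adjacent frames."""
    return [abs(b - a) for a, b in zip(psnr_per_frame, psnr_per_frame[1:])]


def flag_fluctuating_frames(psnr_per_frame, threshold_db):
    """Indices i where quality changes by more than threshold_db from frame i-1 to i.

    threshold_db is an assumed, hypothetical parameter; the paper's actual
    perceptibility criterion is based on visual masking, not a fixed PSNR gap.
    """
    deltas = quality_fluctuation(psnr_per_frame)
    return [i + 1 for i, delta in enumerate(deltas) if delta > threshold_db]


# Made-up per-frame PSNR values with a sharp quality dip at frame 1.
sample_psnr = [38.0, 33.5, 37.8, 37.6]
flagged = flag_fluctuating_frames(sample_psnr, threshold_db=2.0)
print(flagged)
```

A frame-level measure like this only shows where quality jumps in time. Deciding which regions of those frames carry a perceptible artifact is the part the paper addresses, and it is not attempted here.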

Acknowledgments

The authors are indebted to the anonymous reviewers for their thorough reviews, constructive comments and valuable suggestions, which helped to improve the quality and presentation of the manuscript. Sincere thanks also go to our cooperative institutions, (a) the State Key Laboratory of ISN, Xidian University, and (b) the Visual Communications Engineering Research Laboratory, School of Electrical and Computer Engineering, Royal Melbourne Institute of Technology University, for providing the HD test video sequences used in the experiments. This work was supported by the National Natural Science Foundation Research Program of China (No. 61371089).

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Changan District, Xi’an, Shaanxi, China
Yanchao Gong, Shuai Wan, Kaifang Yang & Bo Li
School of Electrical and Computer Engineering, Royal Melbourne Institute of Technology, Melbourne, VIC, 3001, Australia
Hong Ren Wu

Corresponding author

Correspondence to Yanchao Gong.

About this article

Cite this article

Gong, Y., Wan, S., Yang, K. et al. A Visual-Masking-Based Estimation Algorithm for Temporal Pumping Artifact Region Prediction. Circuits Syst Signal Process 36, 1264–1287 (2017). https://doi.org/10.1007/s00034-016-0357-9

Received: 14 October 2015
Revised: 14 June 2016
Accepted: 16 June 2016
Published: 28 June 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s00034-016-0357-9
