Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots

Zuo, Xu-guang; Yu, Lu

doi:10.1631/FITEE.1601552

Published: 15 May 2018

Volume 19, pages 459–470 (2018)

Frontiers of Information Technology & Electronic Engineering

Abstract

The latest video coding standard, High Efficiency Video Coding (HEVC), achieves much higher coding efficiency than previous video coding standards. In particular, its hierarchical B-picture prediction structure removes temporal redundancy among neighboring frames remarkably well. In practice, videos available to consumers, such as TV series, movies, and talk shows, usually contain many repeated shots. According to our observations, when these videos are encoded by HEVC with the hierarchical B-picture structure, the temporal correlation within each shot is well exploited, but the long-term correlation between repeated shots is not used. We propose a long-term prediction (LTP) scheme that exploits the long-term temporal correlation between correlated shots in a video. The long-term reference (LTR) frames of a source video are chosen by clustering similar shots and extracting representative frames, and a modified hierarchical B-picture coding structure based on an LTR frame is introduced to support long-term temporal prediction. An adaptive quantization method is further designed for LTR frames to improve the overall video coding efficiency. Experimental results show that the new coding scheme achieves a coding gain of up to 22.86%.
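The LTR selection step described in the abstract (cluster similar shots, then extract a representative per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the shot features, the clustering algorithm, or the selection rule. The sketch assumes each shot has already been summarized as a small feature vector (for example, a normalized color histogram), uses plain k-means with a deterministic farthest-point initialization, and picks the shot nearest each centroid as the LTR candidate. The names `kmeans`, `pick_representatives`, and `shots` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every seed chosen so far.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each shot to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


def pick_representatives(points, labels, centroids):
    """For each cluster, return the index of the shot closest to the centroid."""
    reps = {}
    for j, c in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            reps[j] = min(members, key=lambda i: dist(points[i], c))
    return reps


# Toy per-shot features (assumed): an A-B-A-B-A-B pattern, as in a
# dialogue scene that cuts back and forth between two camera setups.
shots = [
    (0.10, 0.90), (0.80, 0.20), (0.12, 0.88),
    (0.78, 0.22), (0.11, 0.89), (0.79, 0.21),
]
labels, centroids = kmeans(shots, k=2)
reps = pick_representatives(shots, labels, centroids)
print(labels)  # cluster index per shot
print(reps)    # cluster index -> index of the representative shot
```

On this toy input the alternating shots fall into two clusters, and one shot per cluster is returned as the candidate from which an LTR frame would be taken. In a real encoder the number of clusters, the feature choice, and which frame inside the chosen shot serves as the LTR frame would all need to be decided separately.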

Author information

Authors and Affiliations

Zhejiang Provincial Key Laboratory of Information Processing, Communication and Networking (IPCAN), Institute of Information and Communication Engineering, Zhejiang University, Hangzhou, 310027, China
Xu-guang Zuo & Lu Yu

Corresponding author

Correspondence to Lu Yu.

Additional information

Project supported by the National Natural Science Foundation of China (No. 61371162)

About this article

Zuo, Xg., Yu, L. Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots. Frontiers Inf Technol Electronic Eng 19, 459–470 (2018). https://doi.org/10.1631/FITEE.1601552

Received: 13 September 2016
Revised: 02 December 2016
Accepted: 15 March 2018
Published: 15 May 2018
Issue Date: March 2018
DOI: https://doi.org/10.1631/FITEE.1601552

CLC number

TN919.8
