Abstract
The latest video coding standard, High Efficiency Video Coding (HEVC), achieves much higher coding efficiency than previous video coding standards. In particular, its hierarchical B-picture prediction structure removes temporal redundancy among neighboring frames remarkably well. In practice, videos available to consumers, such as TV series, movies, and talk shows, usually contain many repeated shots. According to our observations, when these videos are encoded by HEVC with the hierarchical B-picture structure, the temporal correlation within each shot is well exploited, but the long-term correlation between repeated shots is not. We propose a long-term prediction (LTP) scheme that exploits the long-term temporal correlation between correlated shots in a video. The long-term reference (LTR) frames of a source video are chosen by clustering similar shots and extracting representative frames, and a modified hierarchical B-picture coding structure based on LTR frames is introduced to support long-term temporal prediction. An adaptive quantization method is further designed for LTR frames to improve the overall video coding efficiency. Experimental results show that the new coding scheme achieves a coding gain of up to 22.86%.
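The LTR selection step described above (cluster similar shots, then extract one representative frame per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the frame features, the clustering algorithm, or the selection rule. The sketch assumes each frame is already summarized by a small feature vector (for example, a normalized color histogram), substitutes a simple greedy distance-threshold clustering, and picks the frame closest to the cluster centroid. The names `cluster_shots`, `select_ltr_frames`, and `threshold` are hypothetical.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def mean_vector(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_shots(shot_features, threshold):
    """Greedy clustering (an assumption, not the paper's method):
    assign each shot to the first cluster whose centroid lies within
    `threshold`; otherwise start a new cluster.
    Returns a list of clusters, each a list of shot indices."""
    clusters = []   # shot indices per cluster
    centroids = []  # running centroid per cluster
    for idx, feat in enumerate(shot_features):
        for c, centroid in enumerate(centroids):
            if dist(feat, centroid) <= threshold:
                clusters[c].append(idx)
                centroids[c] = mean_vector(
                    [shot_features[i] for i in clusters[c]])
                break
        else:
            clusters.append([idx])
            centroids.append(list(feat))
    return clusters


def select_ltr_frames(shots, threshold):
    """shots: list of shots, each a list of per-frame feature vectors.
    Returns one (shot_index, frame_index) pair per cluster."""
    shot_features = [mean_vector(frames) for frames in shots]
    ltr = []
    for members in cluster_shots(shot_features, threshold):
        centroid = mean_vector([shot_features[i] for i in members])
        # Representative frame: the frame nearest the cluster centroid.
        best = min(
            ((s, f) for s in members for f in range(len(shots[s]))),
            key=lambda sf: dist(shots[sf[0]][sf[1]], centroid),
        )
        ltr.append(best)
    return ltr


# Toy A-B-A sequence: shots 0 and 2 show the same scene.
shots = [
    [[0.9, 0.1], [0.8, 0.2]],    # shot 0, scene A
    [[0.1, 0.9], [0.2, 0.8]],    # shot 1, scene B
    [[0.85, 0.15], [0.9, 0.1]],  # shot 2, scene A again
]
shot_clusters = cluster_shots([mean_vector(s) for s in shots], threshold=0.2)
ltr_frames = select_ltr_frames(shots, threshold=0.2)
```

In this toy sequence the two scene-A shots fall into one cluster, so a single LTR frame could serve both of them even though a scene-B shot lies between them. This is the kind of repeated-shot correlation that the scheme targets.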
Additional information
Project supported by the National Natural Science Foundation of China (No. 61371162)
About this article
Cite this article
Zuo, Xg., Yu, L. Long-term prediction for hierarchical-B-picture-based coding of video with repeated shots. Frontiers Inf Technol Electronic Eng 19, 459–470 (2018). https://doi.org/10.1631/FITEE.1601552
DOI: https://doi.org/10.1631/FITEE.1601552
Key words
- High Efficiency Video Coding (HEVC)
- Long-term temporal correlation
- Long-term prediction
- Hierarchical B-picture structure