Abstract
Video coding is a key enabling technology behind the explosion in online video sharing and consumption, which spans live video streaming, video conferencing, video surveillance, remote medicine, online education, online gaming, video broadcasting, cloud video services, and many other applications. The recently released open-source, royalty-free video coding standard known as AV1, designed and developed by the Alliance for Open Media (AOM), achieves a 30%–40% data rate reduction relative to previous-generation video coding standards, including VP9 and HEVC. This paper outlines paradigms that may provide further coding performance gains over AV1. Image restoration has already proven highly effective at improving coding performance in AV1. This paper describes techniques in the same vein that optimize frame reconstruction using deep neural networks (DNNs) to further improve coding performance. Initial explorations of the proposed approach have shown promising results.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Ding, D., Liu, P., Chen, Y., Zhu, Z., Liu, Z., Bankoski, J. (2018). Deep Neural Network Based Frame Reconstruction for Optimized Video Coding. In: Aiello, M., Yang, Y., Zou, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2018. AIMS 2018. Lecture Notes in Computer Science, vol 10970. Springer, Cham. https://doi.org/10.1007/978-3-319-94361-9_18
DOI: https://doi.org/10.1007/978-3-319-94361-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94360-2
Online ISBN: 978-3-319-94361-9
eBook Packages: Computer Science (R0)