Flow-MotionNet: A neural network based video compression architecture

  • 1221: Deep Learning for Image/Video Compression and Visual Quality Assessment
Multimedia Tools and Applications

Abstract

The rapid growth of video content on the internet has driven the development of highly efficient video compression techniques. These techniques aim to make the best use of varying available bandwidth to deliver quality video content. Traditional video compression techniques are mainly block-based and remove redundancy using the discrete cosine transform (DCT). Although they perform well, they do not adapt to varying bandwidth. A number of learning-based video compression schemes have been developed in recent years. Although some of them perform efficiently, they are poorly suited to mobile use because they lack the flexibility to vary reconstruction quality as bandwidth varies. In this paper, a lightweight learning-based video compression architecture is proposed that attempts to let the quality of the reconstructed video vary with the amount of data sent, without requiring separate low-resolution versions of the same video. The proposed model is an amalgamation of three tiny networks, namely a frame autoencoder, a flow autoencoder and a motion extension network. The performance analysis reveals a significant improvement in the visual quality of the video frames, at the cost of increased frame reconstruction time. The results are also compared with some state-of-the-art techniques, including H.264, in terms of SSIM and PSNR.
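As a point of reference for the PSNR figures mentioned above, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), applied to two frames given as flat pixel sequences. The function name `psnr`, its parameters and the 8-bit default peak value of 255 are illustrative assumptions and are not taken from the paper, which may compute the metric differently (for example per channel or averaged over frames).

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float],
         reconstructed: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities. max_value is the peak
    intensity (255 for 8-bit frames, an assumption made for this sketch).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("frames must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, reconstructed)) / len(reference)

    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [100.0, 100.0, 100.0, 100.0]
    decoded = [110.0, 90.0, 110.0, 90.0]  # every pixel is off by 10
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

A higher PSNR means the reconstructed frame is numerically closer to the original. SSIM, the other metric named above, instead compares local luminance, contrast and structure, and is not covered by this sketch.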

Author information

Corresponding author

Correspondence to Sangeeta Yadav.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yadav, S., Gulia, P. & Gill, N.S. Flow-MotionNet: A neural network based video compression architecture. Multimed Tools Appl 81, 42783–42804 (2022). https://doi.org/10.1007/s11042-022-13480-0

  • DOI: https://doi.org/10.1007/s11042-022-13480-0
