End-to-end video compression for surveillance and conference videos

Wang, Shenhao; Zhao, Yu; Gao, Han; Ye, Mao; Li, Shuai

doi:10.1007/s11042-022-13484-w

1221: Deep Learning for Image/Video Compression and Visual Quality Assessment
Published: 05 August 2022

Volume 81, pages 42713–42730, (2022)
Multimedia Tools and Applications

Shenhao Wang¹, Yu Zhao² (ORCID: orcid.org/0000-0002-0606-4676), Han Gao², Mao Ye² & Shuai Li³

Abstract

Storing and transmitting surveillance and conference videos is an important branch of video compression. Because such videos have strong inter-frame correlation, consecutive frames show considerable continuity at both the image level and the motion level. However, existing video codec networks do not fully exploit these characteristics during compression. Building on the DVC video codec framework, we therefore propose a “MV residual + MV optimization” coding strategy (MV: motion vector) for surveillance and conference videos, which further reduces the bit rate and improves the quality of the compressed video frames. During the testing stage, an online update strategy adapts the network’s parameters to each surveillance or conference video. Our contributions are an optical flow residual coding method for videos with strong inter-frame correlation, optical flow optimization at the decoder, and an online update strategy at the encoder. Experiments show that our method outperforms the DVC framework, most notably on the CUHK Square surveillance video, where it achieves a 1.2 dB improvement.
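
The “MV residual” idea can be illustrated with a toy sketch. It is not the authors' network: plain uniform rounding stands in for the learned encoder and decoder, the motion field is reduced to one horizontal component per block, and every function name, the step size, and the sample values are assumptions made for illustration. The sketch only shows why coding the difference from the previously reconstructed motion tends to produce smaller symbols when motion is continuous between frames.

```python
# Illustrative sketch only. Uniform rounding replaces the learned
# MV-residual encoder/decoder; all names and values are hypothetical.

def quantize(values, step=0.5):
    """Map each value to an integer symbol with a uniform quantizer."""
    return [round(v / step) for v in values]


def dequantize(symbols, step=0.5):
    """Map integer symbols back to reconstructed values."""
    return [s * step for s in symbols]


def encode_mv_residual(mv_cur, mv_prev_rec, step=0.5):
    """Code the current motion relative to the previous reconstructed motion."""
    residual = [c - p for c, p in zip(mv_cur, mv_prev_rec)]
    return quantize(residual, step)


def decode_mv_residual(symbols, mv_prev_rec, step=0.5):
    """Rebuild the current motion from the previous motion plus the residual."""
    residual_rec = dequantize(symbols, step)
    return [p + r for p, r in zip(mv_prev_rec, residual_rec)]


def symbol_cost(symbols):
    """Crude rate proxy: sum of absolute symbol magnitudes (not real bits)."""
    return sum(abs(s) for s in symbols)


# Toy horizontal motion (in pixels) for four blocks of two consecutive frames.
mv_prev_rec = [4.0, 4.0, -2.0, 0.0]   # previous frame, already reconstructed
mv_cur = [4.5, 4.0, -2.5, 0.0]        # current frame, nearly the same motion

direct_symbols = quantize(mv_cur)                        # code the motion itself
residual_symbols = encode_mv_residual(mv_cur, mv_prev_rec)
mv_cur_rec = decode_mv_residual(residual_symbols, mv_prev_rec)

print("direct symbols:  ", direct_symbols, "cost", symbol_cost(direct_symbols))
print("residual symbols:", residual_symbols, "cost", symbol_cost(residual_symbols))
print("reconstructed MV:", mv_cur_rec)
```

In this toy case the residual symbols are much smaller than the directly coded ones, which is the intuition behind residual coding for strongly correlated frames. The decoder-side MV optimization and the encoder-side online update described in the abstract are not modelled here.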

References

Agustsson E, Mentzer F, Tschannen M, Cavigelli L, Timofte R, Benini L, Gool LV (2017) Soft-to-hard vector quantization for end-to-end learning compressible representations. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R. (eds) Advances in neural information processing systems, vol 30. Curran Associates, Inc
Alam MM, Nguyen TD, Hagan MT, Chandler DM (2015) A perceptual quantization strategy for HEVC based on a convolutional neural network trained on natural images. In: Applications of digital image processing XXXVIII, vol 9599, p 959918. International Society for Optics and Photonics
Alexandre D, Hang HM (2020) Learned video codec with enriched reconstruction for CLIC P-frame coding. arXiv:2012.07462
Ballé J, Laparra V, Simoncelli EP (2017) End-to-end optimized image compression. In: International conference on learning representations
Ballé J, Minnen D, Singh S, Hwang SJ, Johnston N (2018) Variational image compression with a scale hyperprior. In: International conference on learning representations
Bellard F. BPG image format. http://bellard.org/bpg/. Accessed 30 Jan 2017
Cui W, Zhang T, Zhang S, Jiang F, Zuo W, Zhao D (2018) Convolutional neural networks based intra prediction for HEVC, pp 436–436
Djelouah A, Campos J, Schaub-Meyer S, Schroers C (2019) Neural inter-frame compression for video coding. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6421–6429
Hu Z, Lu G, Xu D (2021) FVC: a new framework towards deep video compression in feature space. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1502–1511
Huo S, Liu D, Wu F, Li H (2018) Convolutional neural network-based motion compensation refinement for video coding. In: 2018 IEEE International symposium on circuits and systems (ISCAS), pp 1–4
Cisco (2016) Cisco Visual Networking Index: forecast and methodology, 2015–2020. White paper, pp 1–41
Johnston N, Vincent D, Minnen D, Covell M, Singh S, Chinen T, Hwang SJ, Shor J, Toderici G (2018) Improved lossy image compression with priming and spatially adaptive bit rates for recurrent networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4385–4393
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neur Inform Process Syst 25:1097–1105
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Li J, Li B, Xu J, Xiong R, Gao W (2018) Fully connected network-based intra prediction for image coding. IEEE Trans Image Process 27(7):3236–3247
Lin J, Liu D, Li H, Wu F (2020) M-LVC: multiple frames prediction for learned video compression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3546–3554
Lu G, Ouyang W, Xu D, Zhang X, Cai C, Gao Z (2019) DVC: an end-to-end deep video compression framework. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11006–11015
Lu G, Cai C, Zhang X, Chen L, Ouyang W, Xu D, Gao Z (2020) Content adaptive and error propagation aware deep video compression. In: European conference on computer vision, pp 456–472. Springer
Marpe D, Schwarz H, Wiegand T (2003) Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard. IEEE Trans Circ Syst Video Technol 13(7):620–636
Minnen D, Ballé J, Toderici G (2018) Joint autoregressive and hierarchical priors for learned image compression. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Advances in neural information processing systems, vol 31. Curran Associates, Inc
Pellegrini S, Ess A, Schindler K, Van Gool L (2009) You’ll never walk alone: modeling social behavior for multi-target tracking. In: 2009 IEEE 12th International conference on computer vision, pp 261–268
Ranjan A, Black MJ (2017) Optical flow estimation using a spatial pyramid network. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 4161–4170
Reda FA, Liu G, Shih KJ, Kirby R, Barker J, Tarjan D, Tao A, Catanzaro B (2018) SDC-Net: video prediction using spatially-displaced convolution. In: Proceedings of the European conference on computer vision (ECCV), pp 718–733
Sengar SS, Mukhopadhyay S (2020) Motion segmentation-based surveillance video compression using adaptive particle swarm optimization. Neural Comput Applic 32(15):11443–11457
Skodras A, Christopoulos C, Ebrahimi T (2001) The JPEG 2000 still image compression standard. IEEE Signal Process Mag 18(5):36–58
Song R, Liu D, Li H, Wu F (2017) Neural network-based arithmetic coding of intra prediction modes in HEVC. In: 2017 IEEE Visual communications and image processing (VCIP), pp 1–4
Song X, Chen Y, Feng ZH, Hu G, Yu DJ, Wu XJ (2020) SP-GAN: self-growing and pruning generative adversarial networks. IEEE Trans Neural Netw Learn Syst 32(6):2458–2469
Sullivan GJ, Ohm JR, Han WJ, Wiegand T (2012) Overview of the high efficiency video coding (HEVC) standard. IEEE Trans Circ Syst Video Technol 22(12):1649–1668
Theis L, Shi W, Cunningham A, Huszár F (2017) Lossy image compression with compressive autoencoders. In: International conference on learning representations
Toderici G, O’Malley SM, Hwang SJ, Vincent D, Minnen D, Baluja S, Covell M, Sukthankar R (2016) Variable rate image compression with recurrent neural networks. In: International conference on learning representations
Toderici G, Vincent D, Johnston N, Jin Hwang S, Minnen D, Shor J, Covell M (2017) Full resolution image compression with recurrent neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5306–5314
Wallace GK (1992) The JPEG still picture compression standard. IEEE Trans Consum Electron 38(1):xviii–xxxiv
Wang M, Li W, Wang X (2012) Transferring a generic pedestrian detector towards specific scenes. In: 2012 IEEE Conference on computer vision and pattern recognition, pp 3274–3281
Wu CY, Singhal N, Krahenbuhl P (2018) Video compression through image interpolation. In: Proceedings of the European conference on computer vision (ECCV), pp 416–431
Wu Y, He T, Chen Z (2020) Memorize, then recall: a generative framework for low bit-rate surveillance video compression. In: 2020 IEEE International symposium on circuits and systems (ISCAS), pp 1–5
Wu L, Huang K, Shen H, Gao L (2021) Foreground-background parallel compression with residual encoding for surveillance video. IEEE Trans Circuits Syst Video Technol 31(7):2711–2724
Xue T, Chen B, Wu J, Wei D, Freeman WT (2019) Video enhancement with task-oriented flow. Int J Comput Vis 127(8):1106–1125
Yan N, Liu D, Li H, Li B, Li L, Wu F (2018) Convolutional neural network-based fractional-pixel motion compensation. IEEE Trans Circuits Syst Video Technol 29(3):840–853
Yang R, Xu M, Wang Z, Li T (2018) Multi-frame quality enhancement for compressed video. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6664–6673
Zhao L, Wang S, Wang S, Ye Y, Ma S, Gao W (2021) Enhanced surveillance video compression with dual reference frames generation. IEEE Trans Circuits Syst Video Technol, 1–1

Acknowledgements

This work was supported in part by the National Key R&D Program of China (2018YFE0203900), the National Natural Science Foundation of China (61773093), the Sichuan Science and Technology Program (2020YFG0476), and the Important Science and Technology Innovation Projects in Chengdu (2018-YF08-00039-GX).

Author information

Authors and Affiliations

School of Physics, University of Electronic Science and Technology of China, Chengdu, 611731, People’s Republic of China
Shenhao Wang
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, People’s Republic of China
Yu Zhao, Han Gao & Mao Ye
School of Information Communication, Shandong University, Jinan, 250000, People’s Republic of China
Shuai Li

Corresponding author

Correspondence to Yu Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, S., Zhao, Y., Gao, H. et al. End-to-end video compression for surveillance and conference videos. Multimed Tools Appl 81, 42713–42730 (2022). https://doi.org/10.1007/s11042-022-13484-w

Received: 29 June 2021
Revised: 17 September 2021
Accepted: 13 July 2022
Published: 05 August 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11042-022-13484-w
