A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution

Zhang, Hengsheng; Zou, Xueyi; Guo, Jiaming; Yan, Youliang; Xie, Rong; Song, Li

doi:10.1007/978-3-031-19790-1_14

Hengsheng Zhang ORCID: orcid.org/0000-0001-6738-3462¹²,
Xueyi Zou¹³,
Jiaming Guo¹³,
Youliang Yan¹³,
Rong Xie¹² &
Li Song ORCID: orcid.org/0000-0002-7124-5182¹²,¹⁴

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13677)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Online processing of compressed videos to increase their resolution is attracting broad and growing attention. Video Super-Resolution (VSR) based on recurrent neural network architectures is a promising solution because it models long-range temporal dependencies efficiently. However, state-of-the-art recurrent VSR models still require significant computation to perform well, mainly because of complicated motion estimation for frame/feature alignment and redundant processing of consecutive video frames. In this paper, considering the characteristics of compressed videos, we propose a Codec Information Assisted Framework (CIAF) to boost and accelerate recurrent VSR models for compressed videos. First, the framework reuses the Motion Vectors already coded in the video to model the temporal relationships between adjacent frames. Experiments demonstrate that Motion Vector based alignment significantly boosts performance with negligible additional computation, and is even comparable to more complex optical flow based alignment. Second, by further using the coded Residuals, the framework can be informed to skip computation on redundant pixels. Experiments demonstrate that the proposed framework can save up to 70% of the computation without a performance drop on the REDS4 test videos encoded by H.264 with a CRF of 23.
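The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how the recurrent models consume Motion Vectors or Residuals. The function names, the whole-pixel motion vectors (H.264 itself allows sub-pixel vectors), the fixed block size, and the residual threshold are all assumptions made for illustration. The first function aligns a previous frame to the current one by copying blocks displaced by their motion vectors; the second marks blocks whose residual is small enough to be treated as redundant.

```python
from typing import List, Tuple

Frame = List[List[float]]
MotionVector = Tuple[int, int]  # assumed (dy, dx) in whole pixels, one per block


def warp_with_motion_vectors(prev: Frame, mvs: List[List[MotionVector]], block: int) -> Frame:
    """Block-wise motion compensation: every pixel copies the pixel of `prev`
    displaced by its block's motion vector, clamped at the frame border."""
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = mvs[y // block][x // block]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = prev[sy][sx]
    return out


def skip_mask_from_residual(residual: Frame, block: int, threshold: float) -> List[List[bool]]:
    """Return one flag per block: True where the largest absolute residual in
    the block is at most `threshold` (a hypothetical tuning parameter)."""
    h, w = len(residual), len(residual[0])
    mask = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            peak = max(
                abs(residual[y][x])
                for y in range(by, min(by + block, h))
                for x in range(bx, min(bx + block, w))
            )
            row.append(peak <= threshold)
        mask.append(row)
    return mask


def skipped_fraction(mask: List[List[bool]]) -> float:
    """Share of blocks flagged as skippable."""
    flat = [flag for row in mask for flag in row]
    return sum(flat) / len(flat)


# Toy 4x4 frame split into 2x2 blocks.
prev = [[float(4 * y + x) for x in range(4)] for y in range(4)]
mvs = [[(0, 0), (0, -2)], [(0, 0), (0, 0)]]  # top-right block points 2 px left
aligned = warp_with_motion_vectors(prev, mvs, block=2)

residual = [[0.0] * 4 for _ in range(4)]
residual[2][2] = 5.0  # only the bottom-right block carries new information
mask = skip_mask_from_residual(residual, block=2, threshold=1.0)
print(aligned[0], mask, skipped_fraction(mask))
```

In this toy case three of the four blocks have a zero residual, so a model that recomputes only the flagged-as-changed block would process a quarter of the pixels. The savings reported in the abstract come from the authors' experiments, not from a rule like this one.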

Acknowledgement

The authors Rong Xie and Li Song were supported by National Key R&D Project of China under Grant 2019YFB1802701, the 111 Project (B07022 and Sheitc No.150633) and the Shanghai Key Laboratory of Digital Media Processing and Transmissions.

Author information

Authors and Affiliations

Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China
Hengsheng Zhang, Rong Xie & Li Song
Huawei Noah’s Ark Lab, Shenzhen, China
Xueyi Zou, Jiaming Guo & Youliang Yan
MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Li Song

Corresponding author

Correspondence to Li Song.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 9329 KB)

Cite this paper

Zhang, H., Zou, X., Guo, J., Yan, Y., Xie, R., Song, L. (2022). A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_14

DOI: https://doi.org/10.1007/978-3-031-19790-1_14
Published: 24 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19789-5
Online ISBN: 978-3-031-19790-1
eBook Packages: Computer Science, Computer Science (R0)
