Abstract
Online processing of compressed videos to increase their resolution is attracting growing attention. Video Super-Resolution (VSR) using a recurrent neural network architecture is a promising solution because it models long-range temporal dependencies efficiently. However, state-of-the-art recurrent VSR models still require significant computation to achieve good performance, mainly because of the complicated motion estimation needed for frame/feature alignment and the redundant processing of consecutive video frames. In this paper, considering the characteristics of compressed videos, we propose a Codec Information Assisted Framework (CIAF) to boost and accelerate recurrent VSR models for compressed videos. First, the framework reuses the Motion Vectors already coded in the video to model the temporal relationships between adjacent frames. Experiments demonstrate that Motion Vector based alignment significantly boosts performance with negligible additional computation, and is even comparable to more complex optical flow based alignment. Second, by further making use of the coded Residuals, the framework can skip the computation on redundant pixels. Experiments demonstrate that the proposed framework can save up to 70% of the computation without a performance drop on the REDS4 test videos encoded with H.264 at a CRF of 23.
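The two ideas in the abstract, alignment from coded Motion Vectors and skipping computation where the coded Residual is negligible, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the fixed block size, the integer-pixel vectors, the residual threshold, and the pointwise `compute_fn` standing in for a network are all assumptions made for illustration; real codecs use variable block sizes and sub-pixel vectors, and the paper applies these ideas inside a recurrent VSR model.

```python
import numpy as np


def warp_with_motion_vectors(prev, mv, block=2):
    """Align `prev` (H, W) to the current frame using per-block motion vectors.

    `mv` has shape (H // block, W // block, 2) and holds integer (dy, dx)
    offsets from each current-frame block to its reference in `prev`.
    (Hypothetical layout; real decoders expose vectors differently.)
    """
    h, w = prev.shape
    # Expand block-level vectors to one vector per pixel.
    dy = np.repeat(np.repeat(mv[..., 0], block, axis=0), block, axis=1)
    dx = np.repeat(np.repeat(mv[..., 1], block, axis=0), block, axis=1)
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp source coordinates so references never leave the frame.
    src_y = np.clip(ys + dy, 0, h - 1)
    src_x = np.clip(xs + dx, 0, w - 1)
    return prev[src_y, src_x]


def recompute_mask(residual, threshold=0.0):
    """True where the coded residual is non-negligible (assumed criterion)."""
    return np.abs(residual) > threshold


def update_with_skipping(aligned_prev, current, mask, compute_fn):
    """Run `compute_fn` only on masked pixels; reuse aligned values elsewhere.

    `compute_fn` is a pointwise stand-in for the expensive network.
    """
    out = aligned_prev.copy()
    out[mask] = compute_fn(current[mask])
    return out


# Small usage example on a 4x4 frame with 2x2 blocks.
prev = np.arange(16, dtype=float).reshape(4, 4)
mv = np.zeros((2, 2, 2), dtype=int)
mv[0, 0] = (0, 1)  # top-left block references one pixel to the right
aligned = warp_with_motion_vectors(prev, mv, block=2)

residual = np.zeros((4, 4))
residual[0, 0] = 3.0  # only one pixel changed beyond prediction
mask = recompute_mask(residual)
current = np.full((4, 4), 5.0)
out = update_with_skipping(aligned, current, mask, lambda x: 2.0 * x)
skipped_fraction = 1.0 - mask.mean()
```

In this toy setting the skipped fraction is simply the share of pixels whose residual is at or below the threshold; how closely that tracks the actual savings depends on how the skipped regions map onto the model's layers, which the sketch does not model.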
Acknowledgement
The authors Rong Xie and Li Song were supported by the National Key R&D Project of China under Grant 2019YFB1802701, the 111 Project (B07022 and Sheitc No. 150633) and the Shanghai Key Laboratory of Digital Media Processing and Transmissions.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, H., Zou, X., Guo, J., Yan, Y., Xie, R., Song, L. (2022). A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_14
DOI: https://doi.org/10.1007/978-3-031-19790-1_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19789-5
Online ISBN: 978-3-031-19790-1
eBook Packages: Computer Science (R0)