MetaVSR: A Novel Approach to Video Super-Resolution for Arbitrary Magnification

  • Conference paper
MultiMedia Modeling (MMM 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14554)

Abstract

Video super-resolution recovers high-resolution video frames from their low-resolution counterparts and has many real-world applications. Most existing video super-resolution models are tailored to specific magnification factors and lack a unified architecture that can handle arbitrary magnifications. To fill this gap, this study introduces “MetaVSR”, a novel video super-resolution model designed to handle arbitrary magnifications. The model comprises three modules: inter-frame alignment, feature extraction, and upsampling. The inter-frame alignment module uses bidirectional propagation to align adjacent frames. The feature extraction module fuses shallow and deep video features to strengthen the model’s representational capacity. The upsampling module establishes a mapping between the desired target resolution and the low-resolution input. Experimental results demonstrate the effectiveness of the proposed MetaVSR model on this task.
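To make the idea of a mapping between target resolution and low-resolution input concrete, the following is a minimal sketch of the position projection commonly used in arbitrary-scale upsampling. It is an illustration under stated assumptions, not the MetaVSR implementation: the abstract does not give the formula, and the function name `hr_to_lr_mapping` and its parameters are hypothetical. For a scale factor r, each output index i is projected to the continuous position i / r on the low-resolution grid, which splits into an integer source index and a fractional offset.

```python
import math


def hr_to_lr_mapping(lr_size, scale):
    """Project each high-resolution index onto the low-resolution grid.

    Hypothetical helper for illustration only; the projection rule
    (floor of i / scale plus a fractional offset) is an assumption and
    is not taken from the MetaVSR paper.
    """
    hr_size = math.floor(lr_size * scale)  # output length for this scale
    mapping = []
    for i in range(hr_size):
        pos = i / scale                           # continuous position on the LR grid
        src = min(math.floor(pos), lr_size - 1)   # LR index; clamp guards against rounding
        offset = pos - src                        # sub-pixel offset within that LR pixel
        mapping.append((src, offset))
    return mapping


if __name__ == "__main__":
    # A non-integer scale (1.5x) on a 4-pixel row yields 6 output pixels.
    for hr_index, (src, offset) in enumerate(hr_to_lr_mapping(4, 1.5)):
        print(hr_index, src, round(offset, 3))
```

In a learned upsampler, the offset (often together with the scale itself) would typically be fed to a small network that predicts per-pixel filter weights, which is what lets a single model serve non-integer scales. Whether MetaVSR does exactly this is not stated in the abstract.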

Acknowledgement

This work was supported by the National Natural Science Foundation of China (62106150, 61836005, 62372304) and the Open Research Fund of Anhui Province Key Laboratory of Machine Vision Inspection (KLMVI-2023-HIT-01).

Author information

Corresponding author

Correspondence to Weipeng Cao.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hong, Z. et al. (2024). MetaVSR: A Novel Approach to Video Super-Resolution for Arbitrary Magnification. In: Rudinac, S., et al. MultiMedia Modeling. MMM 2024. Lecture Notes in Computer Science, vol 14554. Springer, Cham. https://doi.org/10.1007/978-3-031-53305-1_23

  • DOI: https://doi.org/10.1007/978-3-031-53305-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53304-4

  • Online ISBN: 978-3-031-53305-1

  • eBook Packages: Computer Science, Computer Science (R0)
