Efficient, Low-Cost, Real-Time Video Super-Resolution Network

  • Conference paper
Neural Information Processing (ICONIP 2021)

Abstract

The video super-resolution (VSR) task aims to reconstruct the high-frequency information lost in degradation. Researchers have proposed many strong models, but these models require large memory and incur high computational cost. In this paper, we propose a novel VSR model called StudentVSR (StuVSR), a unidirectional recurrent network. To ensure that StuVSR generates sufficient high-frequency information, we propose an Inceptual Attention (IA) mechanism. To compress the model size, we apply knowledge distillation: we take an auto-encoder network as the teacher and redesign the knowledge distillation mode. StuVSR uses very few parameters and performs the VSR task quickly, generating 1080p to 2K video at 30 frames per second (FPS) in real time. Comparison experiments show that StuVSR achieves the highest Peak Signal-to-Noise Ratio (PSNR) score in a comparison with 16 state-of-the-art methods. We also examine the roles of inceptual attention and the knowledge distillation mode through ablation experiments. We will publish the code at https://github.com/Dawn3474/StuVSR.
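For readers unfamiliar with the two quantities quoted in the abstract, the following is a minimal sketch of the standard PSNR definition and of the per-frame time budget implied by 30 FPS. It is not the authors' evaluation code: the function names `psnr` and `frame_budget_ms`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration, and the abstract does not state details such as the colour channel or border cropping used in the reported scores.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences.

    Assumption: pixels are given as flat sequences of numbers and
    max_value is the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def frame_budget_ms(fps):
    """Maximum processing time per frame, in milliseconds, to sustain `fps`."""
    return 1000.0 / fps


if __name__ == "__main__":
    ground_truth = [10, 20, 30, 40]
    restored = [11, 19, 31, 39]  # every pixel is off by one
    print(f"PSNR: {psnr(ground_truth, restored):.2f} dB")
    print(f"Budget at 30 FPS: {frame_budget_ms(30):.1f} ms per frame")
```

A higher PSNR means the restored frame is closer to the ground truth, and a real-time claim at 30 FPS means each output frame must be produced in roughly 33 ms or less.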

L. Wang—Supported by Institute of Information Engineering.

Author information

Corresponding author

Correspondence to Xin Wang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, G., Wang, X., Zha, D., Wang, L., Zhao, L. (2021). Efficient, Low-Cost, Real-Time Video Super-Resolution Network. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_17

  • DOI: https://doi.org/10.1007/978-3-030-92273-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92272-6

  • Online ISBN: 978-3-030-92273-3

  • eBook Packages: Computer Science, Computer Science (R0)