Efficient, Low-Cost, Real-Time Video Super-Resolution Network

Liu, Guanqun; Wang, Xin; Zha, Daren; Wang, Lei; Zhao, Lin

doi:10.1007/978-3-030-92273-3_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13111)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

The Video Super-Resolution (VSR) task aims to reconstruct the high-frequency information lost in degradation. Researchers have proposed many excellent models, but these models require large amounts of memory and incur high computational cost. In this paper, we propose a novel VSR model called StudentVSR (StuVSR), a unidirectional recurrent network. To ensure that StuVSR generates sufficient high-frequency information, we propose the Inceptual Attention (IA) mechanism. To compress the model size, we draw on the idea of knowledge distillation: we take an auto-encoder network as the teacher and redesign the knowledge distillation mode. StuVSR uses very few parameters and performs the VSR task quickly; it can generate 30-frame-per-second (FPS) 1080p-2k videos in real time. We conduct comparison experiments to demonstrate the superiority of StuVSR, which achieves the highest Peak Signal-to-Noise Ratio (PSNR) score among 16 state-of-the-art methods. We also explore the roles of the inceptual attention and the knowledge distillation mode through ablation experiments. We will publish the code at https://github.com/Dawn3474/StuVSR.
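
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how PSNR is commonly computed between a reference frame and a reconstructed frame. It is not taken from the paper or its repository: the function name `psnr`, the flat list-of-pixels representation, and the default peak value of 255 (8-bit images) are all assumptions made for illustration, and the paper's exact evaluation protocol (color space, cropping, averaging over frames) is not stated in the abstract.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists.

    Hypothetical helper for illustration only; frames are assumed to be
    flattened into 1-D sequences of pixel intensities in [0, max_value].
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the reference and the reconstruction.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    # PSNR = 10 * log10(MAX^2 / MSE); higher means a closer reconstruction.
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [10.0, 20.0, 30.0, 40.0]
    reconstructed = [11.0, 19.0, 31.0, 39.0]
    print(f"PSNR: {psnr(ground_truth, reconstructed):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, halving the error raises the score by about 3 dB, which is why small PSNR differences between VSR models can still be meaningful.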

L. Wang is supported by the Institute of Information Engineering.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Guanqun Liu, Xin Wang, Daren Zha, Lei Wang & Lin Zhao
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Guanqun Liu

Corresponding author

Correspondence to Xin Wang.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

About this paper

Cite this paper

Liu, G., Wang, X., Zha, D., Wang, L., Zhao, L. (2021). Efficient, Low-Cost, Real-Time Video Super-Resolution Network. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_17

DOI: https://doi.org/10.1007/978-3-030-92273-3_17
Published: 05 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92272-6
Online ISBN: 978-3-030-92273-3
eBook Packages: Computer Science, Computer Science (R0)
