Skip to main content
Log in

Deeply feature fused video super-resolution network using temporal grouping

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

The video super-resolution (VSR) task refers to the use of corresponding low-resolution frames and multiple neighboring frames to generate high-resolution (HR) frames. An important step in VSR is to fuse the features of the reference frame with the features of the supporting frame. Existing VSR methods do not take full advantage of the information provided by distant neighboring frames and usually fuse the information in a one-stage manner. In this paper, we propose a deep fusion video super-resolution network based on temporal grouping. We divide the input sequence into groups according to different frame rates to provide more accurate supplementary information. Our method aggregates temporal-spatial information at different fusion stages. Firstly, we group the input sequence. Then the temporal-spatial information is extracted and fused hierarchically, and these groups are used to recover the information lost in the reference frame. Secondly, integrate information within each group to generate group-wise features, and then perform multi-stage fusion. The information of the reference frame is fully utilized, resulting in a better recovery of video details. Finally, the upsampling module is used to generate HR frames. We conduct a comprehensive comparative experiment on Vid4, SPMC-11 and Vimeo-90K-T datasets. The results show that the proposed method achieves good performance compared with state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

References

  1. Ahn N, Kang B, Sohn K-A (2018) Fast, accurate, and lightweight super-resolution with cascading residual network. In: Proceedings of the European Conference on Computer Vision, pp 252–268

  2. Bertasius G, Torresani L, Shi J (2018) Object detection in video with spatiotemporal sampling networks. In: Proceedings of the European Conference on Computer Vision, pp 331–346

  3. Caballero J, Ledig C, Aitken A, Acosta A, Totz J, Wang Z, Shi W (2017) Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2848–2857

  4. Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: In Proceedings of the European Conference on Computer Vision, pp 801–818

  5. Dai Jifeng, Qi Haozhi, Xiong Yuwen, Li Yi, Zhang Guodong, Hu Han, Wei Yichen (2017) Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp 764–773

  6. Dong Chao, Loy Chen Change, He Kaiming, Tang Xiaoou (2014) Learning a deep convolutional network for image super-resolution. In: Proceedings of the European Conference on Computer Vision, pp 184–199

  7. Gast J, Roth S (2019) Deep video deblurring: the devil is in the details. In: Proceedings of the IEEE/CVF Conference on International Conference on Computer Vision Workshop, pp 3824–3833

  8. Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the International Conference on Artificial Intelligence and Statistics, pp 315–323

  9. Haris M, Shakhnarovich G, Ukita N (2018) Deep back-projection networks for super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1664–1673

  10. Haris Muhammad, Shakhnarovich G, Ukita N (2019) Recurrent back-projection network for video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3892–3901

  11. Huang X, Belongie S (2017) Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE Conference on International Conference on Computer Vision, pp 1510–1519

  12. Huang Yan, Wang Wei, Wang Liang (2015) Bidirectional recurrent convolutional networks for multi-frame super-resolution. Adv Neural Inf Process Syst 28:235–243

    Google Scholar 

  13. Huang Yuanfei, Li Jie, Gao Xinbo, Yanting Hu, Wen Lu (2021) Interpretable detail-fidelity attention network for single image super-resolution. IEEE Trans Image Process 30:2325–2339

    Article  Google Scholar 

  14. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp 448–456

  15. Isobe T, Li S, Jia X, Yuan S, Slabaugh G, Xu C, Li Y-L, Wang S, Tian Q (2020) Video super-resolution with temporal group attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 8005–8014

  16. Jo Y, Oh SW, Kang J, Kim SJ (2018) Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3224–3232

  17. Armin Kappeler, Seunghwan Yoo, Qiqin Dai, Katsaggelos Aggelos K (2016) Video super-resolution with convolutional neural networks. IEEE Trans Comput Imag 2(2):109–122

    Article  MathSciNet  Google Scholar 

  18. Kim J, Lee JK, Lee KM (2016) Deeply-recursive convolutional network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1637–1645

  19. Kim SY, Lim J, Na T, Kim M (2018) 3dsrnet: video super-resolution using 3d convolutional neural networks. CoRR, arXiv:abs/1812.09079

  20. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980

  21. Kupyn O, Budzan V, Mykhailych M, Mishkin D, Matas J (2018) Deblurgan: blind motion deblurring using conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 8183–8192

  22. Lai W-S, Huang J-B, Ahuja N, Yang M-H (2017) Deep laplacian pyramid networks for fast and accurate super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 624–632

  23. Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2017) Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 105–114

  24. Li Z, Yang J, Liu Z, Yang X, Jeon G, Wu W (2019) Feedback network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3867–3876

  25. Lim B, Son S, Kim H, Nah S, Mu LK (2017) Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp 136–144

  26. Liu Ce, Sun Deqing (2013) On Bayesian adaptive video super resolution. IEEE Trans Pattern Anal Mach Intell 36(2):346–360

    Article  Google Scholar 

  27. Liu D, Wang Z, Fan Y, Liu X, Wang Z, Chang S, Huang T (2017) Robust video super-resolution with learned temporal dynamics. In: Proceedings of the IEEE Conference on International Conference on Computer Vision, pp 2526–2534

  28. Nah S, Tae HK, Kyoung ML (2017) Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3883–3891

  29. Qi Y, Junhua G, Li W, Tian Z, Zhang Y, Geng J (2020) Pulmonary nodule image super-resolution using multi-scale deep residual channel attention network with joint optimization. J Supercomput 76(2):1005–1019

    Article  Google Scholar 

  30. Ren D, Zhang K, Wang Q, Hu Q, Zuo W (2020) Neural blind deconvolution using deep priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3338–3347

  31. Ren Dongwei, Zuo Wangmeng, Zhang David, Zhang Lei, Yang Ming-Hsuan (2021) Simultaneous fidelity and regularization learning for image restoration. IEEE Trans Pattern Anal Mach Intell 43(1):284–299

    Article  Google Scholar 

  32. Sajjadi MSM, Vemulapalli R, Brown M (2018) Frame-recurrent video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6626–6634

  33. Shi W, Caballero J, Huszár F, Totz J, Aitken AP, Bishop R, Rueckert D, Wang Z (2016) Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1874–1883

  34. Song Huihui Xu, Wenjie Liu Dong, Bo Liu, Qingshan Liu, Metaxas Dimitris N (2021) Multi-stage feature fusion network for video super-resolution. IEEE Trans Image Process 30:2923–2934

    Article  Google Scholar 

  35. Tao X, Gao H, Liao R, Wang J, Jia J (2017) Detail-revealing deep video super-resolution. In: Proceedings of the IEEE Conference on International Conference on Computer Vision, pp 4482–4490

  36. Tian Y, Zhang Y, Fu Y, Xu C (2020) Tdan: temporally-deformable alignment network for video super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3357–3366

  37. Kuo-Kun Tseng, Ran Zhang, Chien-Ming Chen, Mehedi Hassan Mohammad (2021) Dnetunet: a semi-supervised CNN of medical image segmentation for super-computing ai service. J Supercomput 77(4):3594–3615

    Article  Google Scholar 

  38. Wang L, Guo Y, Lin Z, Deng X, An W (2018) Learning for video super-resolution through hr optical flow estimation. In: Asian Conference on Computer Vision, pp 514–529

  39. Longguang Wang, Yulan Guo, Li Liu, Zaiping Lin, Xinpu Deng, Wei An (2020) Deep video super-resolution using hr optical flow estimation. IEEE Trans Image Process 29:4323–4336

    Article  Google Scholar 

  40. Wang X, Chan KCK, Yu K, Dong C, Loy CC (2019) Edvr: video restoration with enhanced deformable convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp 1954–1963

  41. Wang X, Yu K, Dong C, Loy CC (2018) Recovering realistic texture in image super-resolution by deep spatial feature transform. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 606–615

  42. Yang Wen, Jie Chen, Bin Sheng, Zhihua Chen, Ping Li, Ping Tan, Tong-Yee Lee (2021) Structure-aware motion deblurring using multi-adversarial optimized cyclegan. IEEE Trans Image Process 30:6142–6155

    Article  Google Scholar 

  43. Tianfan Xue, Chen Baian Wu, Jiajun Wei Donglai, Freeman William T (2019) Video enhancement with task-oriented flow. Int J Comput Vis 127(8):1106–1125

    Article  Google Scholar 

  44. Yi P, Wang Z, Jiang K, Jiang J, Ma J (2019) Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In: Proceedings of the IEEE Conference on International Conference on Computer Vision, pp 3106–3115

  45. Zhang K, Zuo W, Zhang L (2019) Deep plug-and-play super-resolution for arbitrary blur kernels. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1671–1681

  46. Zhang Y, Li K, Li K, Wang L, Zhong B, Fu Y (2018) Image super-resolution using very deep residual channel attention networks. In: Proceedings of the European Conference on Computer Vision, pp 286–301

  47. Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y (2018) Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2472–2481

  48. Zhao Y, Xiong Y, Lin D (2018) Trajectory convolution for action recognition. In: Proceedings of the International Conference on Neural Information Processing Systems, pp 2208–2219

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China under (Grant No. 61703196), the Key Science Foundation of Zhangzhou City under (Grant Nos. ZZ2019ZD11, ZZ2021J23) and the Fujian Province Nature Science Foundation under (Grant Nos. 2020J01813, 2020J01821).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jingmin Yang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chen, Z., Yang, W. & Yang, J. Deeply feature fused video super-resolution network using temporal grouping. J Supercomput 78, 8999–9016 (2022). https://doi.org/10.1007/s11227-021-04299-x

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-021-04299-x

Keywords

Navigation