Multi-hop Video Super Resolution with Long-Term Consistency (MVSRGAN)

Multimedia Tools and Applications

Abstract

Deep learning, and especially Generative Adversarial Networks (GANs), has yielded promising results for image super-resolution. However, super-resolution becomes more difficult when the scaling factor between input and output is large. In this paper we propose a multi-step super-resolution method that scales the image gradually to achieve better results. Video super-resolution (VSR) needs different treatment from single image super-resolution (SISR): it requires a temporal connection between frames, which most existing studies have not fully explored. This temporal feature is important for maintaining video consistency in terms of both video quality and motion continuity. With loss functions that enforce this temporal consistency, we can avoid inconsistencies in the image that accumulate over time. Our method is shown to generate super-resolution video that maintains both video quality and motion continuity. Quantitatively, it achieves higher Peak Signal-to-Noise Ratio (PSNR) scores than state-of-the-art methods on the Vimeo90K, Vid4, and Fireworks datasets, with 37.70, 29.91, and 31.28, respectively. These results show that our model outperforms other state-of-the-art methods across different datasets.
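The two quantities named in the abstract, gradual (multi-hop) scaling and PSNR, can be illustrated with a minimal sketch. It is not the MVSRGAN implementation. The function names, the choice of two 2x hops, and the nearest-neighbour upscaler standing in for a learned generator are all assumptions made for illustration. PSNR follows its standard definition, 10 log10(MAX^2 / MSE), and only the Python standard library is used.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two equally sized 2-D frames.

    Frames are lists of rows of pixel intensities; max_value is the largest
    possible pixel value (255 for 8-bit images).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def upscale_nearest(frame, factor):
    """Hypothetical stand-in for one learned super-resolution hop.

    Repeats every pixel `factor` times in both directions (nearest neighbour).
    """
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def multi_hop_upscale(frame, hops=(2, 2)):
    """Scale a frame gradually, one hop at a time.

    The hop factors (2, 2) are an assumption; they give 4x overall.
    """
    for factor in hops:
        frame = upscale_nearest(frame, factor)
    return frame
```

For example, `multi_hop_upscale([[0, 255], [255, 0]])` turns a 2x2 frame into an 8x8 frame in two hops, and `psnr` can then compare any such output against a ground-truth frame of the same size.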

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Corresponding author

Correspondence to Wisnu Aditya.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Timothy K. Shih, Tipajin Thaipisutikul and Chih-Yang Lin contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aditya, W., Shih, T.K., Thaipisutikul, T. et al. Multi-hop Video Super Resolution with Long-Term Consistency (MVSRGAN). Multimed Tools Appl 83, 4115–4132 (2024). https://doi.org/10.1007/s11042-023-15351-8

  • DOI: https://doi.org/10.1007/s11042-023-15351-8
