Multi-hop Video Super Resolution with Long-Term Consistency (MVSRGAN)

Aditya, Wisnu; Shih, Timothy K; Thaipisutikul, Tipajin; Lin, Chih-Yang

doi:10.1007/s11042-023-15351-8

Published: 22 May 2023

Volume 83, pages 4115–4132, (2024)

Multimedia Tools and Applications

Wisnu Aditya ORCID: orcid.org/0000-0002-0406-8083¹,
Timothy K Shih¹,
Tipajin Thaipisutikul² &
Chih-Yang Lin³


Abstract

Deep learning, and especially Generative Adversarial Networks (GANs), has yielded promising results for image super-resolution. However, super-resolution becomes more difficult when the scaling difference between input and output is large. In this paper we propose a multi-step super-resolution method that scales the image gradually to obtain the best results. Video super-resolution (VSR) needs different treatment from single image super-resolution (SISR): it requires a temporal connection between frames, which most existing studies have not fully explored. This temporal feature is important for maintaining video consistency, in terms of both video quality and motion continuity. With loss functions based on this temporal feature, we can avoid inconsistency failures in the image that accumulate over time. Our method is shown to generate super-resolution video that maintains both video quality and motion continuity. Quantitatively, it achieves higher Peak Signal-to-Noise Ratio (PSNR) scores than state-of-the-art methods on the Vimeo90K, Vid4, and Fireworks datasets: 37.70, 29.91, and 31.28, respectively. These results show that our model outperforms other state-of-the-art methods across different datasets.
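For readers unfamiliar with the metric reported above, PSNR can be computed with a short sketch like the one below. It uses the standard definition, 10·log10(MAX² / MSE), and is not the authors' evaluation code. The function names `psnr` and `mean_video_psnr`, the 8-bit peak value of 255, and the choice to average per-frame scores over a clip are assumptions made for illustration; the abstract does not state the exact evaluation protocol.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return PSNR in dB between two frames given as nested lists of pixels.

    Assumes both frames have the same size and share the peak value
    `max_value` (255 for 8-bit images; an assumption, not from the paper).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_video_psnr(reference_frames, estimated_frames, max_value=255.0):
    """Average per-frame PSNR over a clip (one common, assumed convention)."""
    scores = [
        psnr(r, e, max_value)
        for r, e in zip(reference_frames, estimated_frames)
    ]
    return sum(scores) / len(scores)
```

A higher PSNR means the super-resolved frame is closer to the ground-truth frame in the mean-squared-error sense; it does not by itself measure motion continuity between frames.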

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Central University, Zhongli, Taoyuan, 32001, Taiwan
Wisnu Aditya & Timothy K Shih
Faculty of Information and Communication Technology (ICT), Mahidol University, Salaya, Nakhon Pathom, 73170, Thailand
Tipajin Thaipisutikul
Department of Mechanical Engineering, National Central University, Zhongli, Taoyuan, 32001, Taiwan
Chih-Yang Lin

Corresponding author

Correspondence to Wisnu Aditya.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Timothy K. Shih, Tipajin Thaipisutikul and Chih-Yang Lin contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aditya, W., Shih, T.K., Thaipisutikul, T. et al. Multi-hop Video Super Resolution with Long-Term Consistency (MVSRGAN). Multimed Tools Appl 83, 4115–4132 (2024). https://doi.org/10.1007/s11042-023-15351-8

Received: 08 November 2021
Revised: 10 October 2022
Accepted: 15 April 2023
Published: 22 May 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-15351-8
