Abstract
Deep learning, and Generative Adversarial Networks (GANs) in particular, has yielded promising results for image super-resolution. However, super-resolution with a large scaling factor between input and output is considerably more difficult. In this paper we propose a multi-step super-resolution method that scales the image gradually to obtain the best results. Video super-resolution (VSR) needs different treatment from single image super-resolution (SISR): it requires a temporal connection between frames, which most existing studies have not fully explored. This temporal feature is essential for maintaining video consistency, in terms of both video quality and motion continuity. With loss functions that target this temporal consistency, we can avoid inconsistencies in the image that accumulate over time. Our method generates super-resolution video that maintains both video quality and motion continuity. Quantitatively, it achieves higher Peak Signal-to-Noise Ratio (PSNR) scores than state-of-the-art methods on the Vimeo90K, Vid4, and Fireworks datasets, with 37.70, 29.91, and 31.28 respectively. The results show that our model outperforms other state-of-the-art methods across different datasets.
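For readers unfamiliar with the metric reported above, the following is a minimal sketch of how PSNR is commonly computed. It assumes 8-bit pixel values (peak value 255) and frames given as flat sequences of pixel intensities. The function names `psnr` and `video_psnr` are illustrative, and averaging per-frame scores is one common convention, not necessarily the evaluation protocol used in this paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return PSNR in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities (an assumption made
    for this sketch); max_value is the peak intensity, 255 for 8-bit data.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def video_psnr(reference_frames, estimated_frames, max_value=255.0):
    """Average the per-frame PSNR over a clip (assumed convention)."""
    scores = [
        psnr(ref, est, max_value)
        for ref, est in zip(reference_frames, estimated_frames)
    ]
    if not scores:
        raise ValueError("at least one frame pair is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two tiny 4-pixel "frames": ground truth and a slightly noisy estimate.
    truth = [[10, 20, 30, 40], [50, 60, 70, 80]]
    output = [[11, 19, 30, 42], [50, 61, 69, 80]]
    print(f"clip PSNR: {video_psnr(truth, output):.2f} dB")
```

Higher values indicate a reconstruction closer to the ground-truth frames. PSNR is computed frame by frame, so on its own it does not measure motion continuity between frames.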
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Timothy K. Shih, Tipajin Thaipisutikul and Chih-Yang Lin contributed equally to this work.
About this article
Cite this article
Aditya, W., Shih, T.K., Thaipisutikul, T. et al. Multi-hop Video Super Resolution with Long-Term Consistency (MVSRGAN). Multimed Tools Appl 83, 4115–4132 (2024). https://doi.org/10.1007/s11042-023-15351-8
DOI: https://doi.org/10.1007/s11042-023-15351-8