Abstract
To model temporal relationships, most previous video super-resolution (VSR) methods follow either an iterative or a recurrent framework. The iterative framework takes neighboring low-resolution (LR) frames from a sliding window, while the recurrent framework reuses the output generated in the previous SR step. The hybrid framework combines the two but still cannot fully leverage the temporal relationships. Meanwhile, existing methods are limited by the receptive field of the optical flow or lack semantic constraints on motion information. In this work, we propose an omniscient framework that fully explores the temporal relationships in a video by encompassing both LR frames and SR outputs from the past, present, and future. The omniscient framework is more generic because the iterative, recurrent, and hybrid frameworks can be regarded as its special cases. In handling motion information, most previous VSR methods adopt explicit motion estimation and compensation, while many recent methods turn to implicit alignment. Among implicit alignment methods, basic non-local means suffers from heavy computational cost, so we improve it by capturing non-local correlations in a relatively local manner to reduce the complexity. Moreover, we integrate the explicit and implicit methods into an explicit-implicit alignment module to better utilize motion information. Extensive experiments on public datasets show that our method is superior to state-of-the-art methods in objective metrics, subjective visual quality, and complexity. In particular, on the Vid4 and UDM10 datasets, our method improves PSNR by 0.19 dB and 0.49 dB, respectively, over the most advanced method, BasicVSR++.
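The claim that the iterative, recurrent, and hybrid frameworks are special cases of the omniscient one can be illustrated with a small sketch that only tracks which frame indices each framework reads when super-resolving frame `t`. This is a hypothetical illustration, not the paper's implementation: the names `Inputs`, `framework_inputs`, and `is_special_case`, the `radius` parameter, and the assumption that the recurrent framework reads the current LR frame plus the previous SR output are all introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inputs:
    lr: tuple  # indices of LR frames the framework reads
    sr: tuple  # indices of SR outputs the framework reads


def framework_inputs(name, t, num_frames, radius=1):
    """Return the frame indices used to super-resolve frame t (illustrative only)."""

    def valid(indices):
        # Drop indices that fall outside the video.
        return tuple(i for i in indices if 0 <= i < num_frames)

    window = range(t - radius, t + radius + 1)  # past, present, and future
    if name == "iterative":
        # Sliding window of neighboring LR frames, no SR feedback.
        return Inputs(lr=valid(window), sr=())
    if name == "recurrent":
        # Assumed here: current LR frame plus the previous SR output.
        return Inputs(lr=valid([t]), sr=valid([t - 1]))
    if name == "hybrid":
        # Sliding LR window combined with the previous SR output.
        return Inputs(lr=valid(window), sr=valid([t - 1]))
    if name == "omniscient":
        # LR frames and SR outputs from the past, present, and future.
        return Inputs(lr=valid(window), sr=valid(window))
    raise ValueError(f"unknown framework: {name}")


def is_special_case(special, general):
    """True if every input of `special` is also an input of `general`."""
    return set(special.lr) <= set(general.lr) and set(special.sr) <= set(general.sr)


if __name__ == "__main__":
    omni = framework_inputs("omniscient", t=5, num_frames=10)
    for name in ("iterative", "recurrent", "hybrid"):
        other = framework_inputs(name, t=5, num_frames=10)
        print(name, other, is_special_case(other, omni))
```

In this reading, each earlier framework uses a subset of the omniscient framework's inputs. How the present and future SR outputs are obtained before frame `t` is finalized is not stated in the abstract, so the sketch leaves that open.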
Index Terms
- Omniscient Video Super-Resolution with Explicit-Implicit Alignment