Abstract
Video super-resolution (VSR) aims to estimate and restore high-resolution (HR) sequences from low-resolution (LR) inputs. In recent years, many learning-based VSR methods have been proposed that combine convolutional neural networks (CNNs) with motion compensation. Most mainstream approaches rely on optical flow or deformable convolution, both of which require accurate motion estimates for compensation. However, most previous methods do not fully exploit the spatial-temporal symmetry information in the input sequences. Moreover, considerable computation is consumed by aligning every neighbouring frame to the reference frame separately. Furthermore, many methods reconstruct HR results at only a single scale, which limits the reconstruction accuracy of the network and its performance in complex scenes. In this study, we propose a spatial-temporal symmetry network (STSN) to address these deficiencies. STSN comprises four parts: prefusion, alignment, postfusion and reconstruction. First, a two-stage fusion strategy is applied to reduce the computational cost of the network: ConvGRU is employed in the prefusion module to eliminate redundant features between neighbouring frames and to fuse and condense several neighbouring frames into two parts. To generate accurate offset maps, we present a spatial-temporal symmetry attention block (STSAB), which exploits spatial-temporal symmetry combined with spatial attention. In the reconstruction module, we propose an SR multiscale residual block (SR-MSRB) to enhance reconstruction performance. Extensive experiments on several datasets show that our method achieves better accuracy and efficiency, in both quantitative and qualitative measures, than state-of-the-art methods.
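As background for the prefusion module (this is standard material, not notation from the abstract itself): the ConvGRU cell used for temporal fusion follows the convolutional gated recurrent unit of Ballas et al. (2015), which replaces the matrix products of a standard GRU with 2D convolutions so that the hidden state retains spatial structure:

```latex
\begin{aligned}
z_t &= \sigma\!\left(W_z * x_t + U_z * h_{t-1}\right) && \text{(update gate)}\\
r_t &= \sigma\!\left(W_r * x_t + U_r * h_{t-1}\right) && \text{(reset gate)}\\
\tilde{h}_t &= \tanh\!\left(W * x_t + U * \left(r_t \odot h_{t-1}\right)\right) && \text{(candidate state)}\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(hidden state)}
\end{aligned}
```

Here $*$ denotes 2D convolution, $\odot$ elementwise multiplication, $x_t$ the features of the current frame, and $h_t$ the fused hidden state; the update gate $z_t$ controls how much redundant information from neighbouring frames is carried forward versus discarded, which is what allows the prefusion stage to condense several neighbouring frames before alignment.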
Wang, X., Liu, M. & Wei, P. Learning a spatial-temporal symmetry network for video super-resolution. Appl Intell 53, 3530–3544 (2023). https://doi.org/10.1007/s10489-022-03603-3