DOI: 10.1145/3474085.3475189
research-article

Stereo Video Super-Resolution via Exploiting View-Temporal Correlations

Published: 17 October 2021

Abstract

Stereo Video Super-Resolution (StereoVSR) aims to generate high-resolution video streams from two low-resolution videos in a stereo setting. Existing video super-resolution and stereo image super-resolution techniques can be extended to the StereoVSR task, but they do not fully exploit multi-view and temporal information and therefore fall short of satisfactory performance. In this paper, we propose a novel Stereo Video Super-Resolution Network (SVSRNet) that addresses the StereoVSR task by exploiting view-temporal correlations. First, we devise a view-temporal attention module (VTAM) that integrates cross-time, cross-view information for constructing high-resolution stereo videos. Second, we propose a spatial-temporal fusion module (STFM), which aggregates intra-view information across time to emphasize important features for subsequent restoration. In addition, we design a view-temporal consistency loss function to enforce a consistency constraint on the super-resolved stereo videos. Comprehensive experimental results demonstrate that our method generates superior results.
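
The abstract does not spell out how VTAM combines features, so the following is only a rough illustration of the general idea of attending across time and across views, not the authors' module. It uses plain scaled dot-product attention: a query feature from the reference frame of one view is compared with candidate features from other frames and from the other view, and the candidates are fused by their softmax weights. The function name `cross_view_temporal_attention`, the flat-list feature layout, and the toy values are all assumptions made for this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_view_temporal_attention(query, candidates):
    """Fuse candidate features, weighted by their similarity to the query.

    query:      feature vector from the reference frame of one view.
    candidates: feature vectors from other time steps and from the other
                view (hypothetical layout, chosen for illustration only).
    Returns (fused_vector, attention_weights).
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, c) / scale for c in candidates]
    weights = softmax(scores)
    fused = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]
    return fused, weights


# Toy features: two candidates resemble the query, one does not.
query = [1.0, 0.0]
candidates = [
    [1.0, 0.0],  # same view, neighbouring frame
    [0.0, 1.0],  # other view, same frame
    [1.0, 0.0],  # other view, neighbouring frame
]
fused, weights = cross_view_temporal_attention(query, candidates)
print(weights)
print(fused)
```

In this toy case the two candidates that match the query receive equal, larger weights than the dissimilar one, so the fused feature stays close to the query. A real network would apply this per spatial location on learned feature maps.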

Supplementary Material

ZIP File (mfp0182aux.zip)
Supplemental Material contains PDF and Video with more results.
MP4 File (mm2021.mp4)
Presentation for "Stereo Video Super-Resolution via Exploiting View-Temporal Correlations"

    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN: 9781450386517
    DOI: 10.1145/3474085
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. stereo video
    2. video super-resolution
    3. view-temporal correlations

    Qualifiers

    • Research-article

    Conference

    MM '21
    Sponsor:
    MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 83
    • Downloads (Last 6 weeks): 14
    Reflects downloads up to 20 Feb 2025

    Citations

    Cited By

    • (2025) Conti-Fuse: A novel continuous decomposition-based fusion framework for infrared and visible images. Information Fusion 117 (102839). DOI: 10.1016/j.inffus.2024.102839. Online publication date: May-2025
    • (2024) Joint Video Denoising and Super-Resolution Network for IoT Cameras. IEEE Internet of Things Journal 11:17 (28526-28538). DOI: 10.1109/JIOT.2024.3402622. Online publication date: 1-Sep-2024
    • (2023) Cross-View Attention Interaction Fusion Algorithm for Stereo Super-Resolution. Applied Sciences 13:12 (7265). DOI: 10.3390/app13127265. Online publication date: 18-Jun-2023
    • (2023) Mutual-Guided Dynamic Network for Image Fusion. Proceedings of the 31st ACM International Conference on Multimedia (1779-1788). DOI: 10.1145/3581783.3612261. Online publication date: 26-Oct-2023
    • (2023) Cross-view Resolution and Frame Rate Joint Enhancement for Binocular Video. Proceedings of the 31st ACM International Conference on Multimedia (8367-8375). DOI: 10.1145/3581783.3612213. Online publication date: 26-Oct-2023
    • (2023) Dynamic Grouped Interaction Network for Low-Light Stereo Image Enhancement. Proceedings of the 31st ACM International Conference on Multimedia (2468-2476). DOI: 10.1145/3581783.3611895. Online publication date: 26-Oct-2023
    • (2023) Accelerating Stereo Image Simulation for Automotive Applications Using Neural Stereo Super Resolution. IEEE Transactions on Intelligent Transportation Systems 24:11 (12627-12636). DOI: 10.1109/TITS.2023.3287912. Online publication date: 5-Jul-2023
    • (2023) Zero-Shot Dual-Lens Super-Resolution. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (9130-9139). DOI: 10.1109/CVPR52729.2023.00881. Online publication date: Jun-2023
    • (2023) CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (5906-5916). DOI: 10.1109/CVPR52729.2023.00572. Online publication date: Jun-2023
    • (2022) NTIRE 2022 Challenge on Stereo Image Super-Resolution: Methods and Results. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (905-918). DOI: 10.1109/CVPRW56347.2022.00105. Online publication date: Jun-2022
