ABSTRACT
Stereo images have recently improved super-resolution (SR) performance, since the second view can provide complementary information. Transformers have shown significant performance gains on computer vision tasks, but they require substantial computing resources and training time. To alleviate this problem, we introduce an efficient Transformer feature extraction block that captures long-range pixel interactions with lower resource consumption. Many kinds of cross-view interaction modules exist for stereo image SR, but each limits the SR performance of its own model. To address this challenge, we first propose the strong-weak cross-view interaction mechanism, which consists of a strong cross-view interaction module and a weak cross-view interaction module. Benefiting from the proposed mechanism, SR performance improves significantly with a negligible increase in computing cost. We integrate the efficient Transformer feature extraction block and the strong-weak cross-view interaction mechanism into a unified framework named the strong-weak cross-view interaction network (SWCVIN). Extensive experiments on three benchmark datasets show that the proposed model achieves state-of-the-art results.
Index Terms
- Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution