research-article · DOI: 10.1145/3581783.3612381 · MM '23 Conference Proceedings

S-OmniMVS: Incorporating Sphere Geometry into Omnidirectional Stereo Matching

Published: 27 October 2023

Abstract

Multi-fisheye stereo matching is a promising task that employs the traditional multi-view stereo (MVS) pipeline with spherical sweeping to acquire omnidirectional depth. However, existing omnidirectional MVS methods neglect fisheye and omnidirectional distortions, yielding inferior performance. In this paper, we revisit omnidirectional MVS by incorporating three sphere geometry priors: spherical projection, spherical continuity, and spherical position. To deal with fisheye distortion, we propose a new distortion-adaptive fusion module that converts fisheye inputs into distortion-free spherical tangent representations by constructing a spherical projection space. The resulting multi-scale features are then adaptively aggregated with additional learnable offsets to enhance content perception. To handle omnidirectional distortion, we present a new spherical cost aggregation module that jointly considers spherical continuity and spherical position. Concretely, we first design a rotation continuity compensation mechanism that ensures omnidirectional depth consistency across the left-right boundaries without introducing extra computation. Second, we encode the geometry-aware spherical position and inject it into the cost aggregation to relieve panoramic distortion and perceive the 3D structure. Furthermore, to avoid the excessive concentration of depth hypotheses caused by inverse-depth linear sampling, we develop a segmented sampling strategy that combines linear and exponential spaces. Together with the three sphere priors, this yields S-OmniMVS. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art (SoTA) solutions by a large margin on various datasets, both quantitatively and qualitatively.
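The segmented sampling idea in the abstract can be illustrated with a small sketch: plain inverse-depth linear sampling concentrates hypotheses near the camera, so the range is split into a near segment sampled linearly and a far segment sampled exponentially (log-spaced). The split point (a geometric mean here), segment sizes, and spacing choices are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def segmented_depth_hypotheses(d_min, d_max, n_linear, n_exp):
    """Sketch of a segmented depth-hypothesis sampler.

    Near segment: linear spacing in depth, dense close to the camera.
    Far segment: exponential (log-space) spacing, so distant depths are
    not starved of hypotheses the way pure inverse-depth sampling leaves
    them over-concentrated near the minimum depth.
    """
    # Hypothetical split point between the two segments.
    d_mid = np.sqrt(d_min * d_max)
    # Linear segment; endpoint=False avoids duplicating d_mid.
    near = np.linspace(d_min, d_mid, n_linear, endpoint=False)
    # Exponential segment from d_mid out to d_max.
    far = np.geomspace(d_mid, d_max, n_exp)
    return np.concatenate([near, far])

# Example: 32 hypotheses over a 0.5 m - 50 m sweep.
hyps = segmented_depth_hypotheses(0.5, 50.0, 16, 16)
```

The returned hypotheses are strictly increasing and cover the full sweep range, with roughly half the budget spent on the near field.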


Cited By

  • (2024) RomniStereo: Recurrent Omnidirectional Stereo Matching. IEEE Robotics and Automation Letters 9, 3 (March 2024), 2511-2518. DOI: 10.1109/LRA.2024.3357315
  • (2024) Recurrent Omnidirectional Stereo Matching Method Based on Mixture of Laplace Loss. 2024 IEEE International Symposium on Product Compliance Engineering - Asia (ISPCE-ASIA), 1-6 (25 October 2024). DOI: 10.1109/ISPCE-ASIA64773.2024.10756276


    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023, 9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. fisheye distortion
    2. omnidirectional 3d estimation
    3. positional encoding
    4. spherical projection


    Funding Sources

    • National Natural Science Foundation of China

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023, Ottawa, ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 125
    • Downloads (last 6 weeks): 17

    Reflects downloads up to 05 Mar 2025
