research-article · DOI: 10.1145/3581783.3612381 · MM '23 Conference Proceedings

S-OmniMVS: Incorporating Sphere Geometry into Omnidirectional Stereo Matching

Published: 27 October 2023

Abstract

Multi-fisheye stereo matching is a promising task that employs the traditional multi-view stereo (MVS) pipeline with spherical sweeping to acquire omnidirectional depth. However, existing omnidirectional MVS methods neglect fisheye and omnidirectional distortions, yielding inferior performance. In this paper, we revisit omnidirectional MVS by incorporating three sphere geometry priors: spherical projection, spherical continuity, and spherical position. To deal with fisheye distortion, we propose a new distortion-adaptive fusion module that converts fisheye inputs into distortion-free spherical tangent representations by constructing a spherical projection space. The resulting multi-scale features are then adaptively aggregated with additional learnable offsets to enhance content perception. To handle omnidirectional distortion, we present a new spherical cost aggregation module that jointly considers spherical continuity and spherical position. Concretely, we first design a rotation continuity compensation mechanism that ensures omnidirectional depth consistency across the left-right boundaries without introducing extra computation. Second, we encode the geometry-aware spherical position and inject it into the cost aggregation to relieve panoramic distortion and perceive the 3D structure. Furthermore, to avoid the excessive concentration of depth hypotheses caused by inverse-depth linear sampling, we develop a segmented sampling strategy that combines linear and exponential spaces. Together with the three sphere priors, this yields S-OmniMVS. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art (SoTA) solutions by a large margin on various datasets, both quantitatively and qualitatively.
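The segmented sampling idea in the abstract can be illustrated with a small sketch: plain inverse-depth linear sampling concentrates hypotheses near the camera, so the range is split into a near segment sampled linearly and a far segment sampled exponentially (log-spaced). The split point (a geometric mean here), segment sizes, and spacing choices are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def segmented_depth_hypotheses(d_min, d_max, n_linear, n_exp):
    """Sketch of a segmented depth-hypothesis sampler.

    Near segment: linear spacing in depth, dense close to the camera.
    Far segment: exponential (log-space) spacing, so distant depths are
    not starved of hypotheses the way pure inverse-depth sampling leaves
    them over-concentrated near the minimum depth.
    """
    # Hypothetical split point between the two segments.
    d_mid = np.sqrt(d_min * d_max)
    # Linear segment; endpoint=False avoids duplicating d_mid.
    near = np.linspace(d_min, d_mid, n_linear, endpoint=False)
    # Exponential segment from d_mid out to d_max.
    far = np.geomspace(d_mid, d_max, n_exp)
    return np.concatenate([near, far])

# Example: 32 hypotheses over a 0.5 m - 50 m sweep.
hyps = segmented_depth_hypotheses(0.5, 50.0, 16, 16)
```

The returned hypotheses are strictly increasing and cover the full sweep range, with roughly half the budget spent on the near field.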


Cited By

  • (2024) RomniStereo: Recurrent Omnidirectional Stereo Matching. IEEE Robotics and Automation Letters 9, 3 (March 2024), 2511-2518. DOI: 10.1109/LRA.2024.3357315
  • (2024) Recurrent Omnidirectional Stereo Matching Method Based on Mixture of Laplace Loss. 2024 IEEE International Symposium on Product Compliance Engineering - Asia (ISPCE-ASIA), 1-6 (25 October 2024). DOI: 10.1109/ISPCE-ASIA64773.2024.10756276


    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023, 9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. fisheye distortion
    2. omnidirectional 3d estimation
    3. positional encoding
    4. spherical projection


    Funding Sources

    • National Natural Science Foundation of China

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023, Ottawa, ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 125
    • Downloads (last 6 weeks): 17

    Reflects downloads up to 05 Mar 2025
