Rethinking probability volume for multi-view stereo: A probability analysis method

Yu, Zonghua; Wang, Huaijun; Li, Junhuai; Jin, Haiyan; Cao, Ting; Cheng, Kuanhong

doi:10.1007/s10489-025-06284-w

Published: 03 February 2025

Volume 55, article number 396, (2025)

Applied Intelligence

Zonghua Yu^1, Huaijun Wang^1,2 (ORCID: orcid.org/0000-0002-2933-6566), Junhuai Li^1,2, Haiyan Jin^1,2, Ting Cao^1,2 & Kuanhong Cheng^1,2

Abstract

Existing learning-based multi-view stereo (MVS) models primarily focus on predicting depth maps through a cascaded structure to achieve more robust reconstruction results. However, they often emphasize improving the quality of stereo matching while overlooking the importance of depth hypotheses. In this paper, we propose a novel MVS model from the perspective of probability volume analysis. First, the guiding effect of the probability volume is considered for depth refinement. Ideally, the probability distribution along the depth dimension of the probability volume follows a unimodal pattern. We design a unimodal curve to fit this pattern. Then, a reasonable depth refinement range is adaptively selected for each pixel position based on a predefined probability threshold. Additionally, considering that matching noise may blur the unimodal peak of the probability volume, we design the probability volume split-merge module (PVS-PVM). This module performs a peak search based on conditional constraints, splits the probability volume into main and sub probability volumes, and then computes two sets of depth hypotheses from them. Finally, new main and sub probability volumes are computed from these depth hypotheses and merged to predict the depth. This approach allows for a more comprehensive consideration of the regions with higher probability, improving the robustness of the depth hypotheses. Experimental results demonstrate that our method effectively utilizes probability volume information to guide depth map refinement and yields enhanced reconstruction results on the DTU and Tanks & Temples datasets. Our code will be released at https://github.com/zongh5a/ProbMVSNet.
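To make the two ideas in the abstract more concrete, the following is a minimal per-pixel sketch in Python with NumPy. It is not the authors' implementation: the function names `refinement_range` and `split_main_sub`, the threshold value, and the plain local-maximum peak test are all assumptions chosen for illustration. The abstract specifies neither the fitted unimodal curve nor the conditional constraints of the peak search, so the sketch works on the raw probabilities directly.

```python
import numpy as np


def refinement_range(prob, depths, threshold=0.1):
    """Return (d_min, d_max) around the peak where prob stays >= threshold.

    prob      : 1-D probabilities over the depth hypotheses of one pixel.
    depths    : 1-D depth values of the same length, in increasing order.
    threshold : hypothetical cut-off (assumption, not taken from the paper).
    """
    prob = np.asarray(prob, dtype=float)
    depths = np.asarray(depths, dtype=float)
    peak = int(np.argmax(prob))
    lo = hi = peak
    # Grow the interval outward from the peak while the neighbouring
    # hypotheses still reach the threshold.
    while lo > 0 and prob[lo - 1] >= threshold:
        lo -= 1
    while hi < len(prob) - 1 and prob[hi + 1] >= threshold:
        hi += 1
    return float(depths[lo]), float(depths[hi])


def split_main_sub(prob):
    """Split one distribution into a main and a sub part (illustrative only).

    Peaks are plain local maxima; the highest one anchors the main part and
    the second highest anchors the sub part. Each bin goes to the nearer peak.
    """
    prob = np.asarray(prob, dtype=float)
    padded = np.concatenate(([-np.inf], prob, [-np.inf]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])
    peaks = np.flatnonzero(is_peak)
    ranked = peaks[np.argsort(prob[peaks])[::-1]]  # highest peak first
    if len(ranked) == 1:
        # Clean unimodal case: nothing to split off.
        return prob.copy(), np.zeros_like(prob)
    main_peak, sub_peak = int(ranked[0]), int(ranked[1])
    idx = np.arange(len(prob))
    to_main = np.abs(idx - main_peak) <= np.abs(idx - sub_peak)
    main = np.where(to_main, prob, 0.0)
    sub = np.where(to_main, 0.0, prob)
    return main, sub


# Usage on toy data (values are made up for illustration).
depths = np.linspace(1.0, 8.0, 8)
unimodal = [0.02, 0.05, 0.30, 0.40, 0.15, 0.03, 0.04, 0.01]
d_min, d_max = refinement_range(unimodal, depths, threshold=0.1)

bimodal = [0.05, 0.30, 0.10, 0.05, 0.35, 0.15]
main, sub = split_main_sub(bimodal)
```

In this sketch a sharper peak yields a narrower refinement range, while a flatter one keeps more hypotheses above the threshold and therefore a wider range. The split keeps the second-highest peak as its own set of candidates instead of discarding it, which is the intuition behind considering all regions of higher probability.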


Data Availability

The datasets used in this study can be downloaded from https://github.com/YoYo000/MVSNet, and the results of the proposed model on the Tanks and Temples dataset have been submitted to https://www.tanksandtemples.org/leaderboard/.

Funding

This work was supported in part by the National Natural Science Foundation of China (No. 62105258, No. 62272383).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, 710048, China
Zonghua Yu, Huaijun Wang, Junhuai Li, Haiyan Jin, Ting Cao & Kuanhong Cheng
Shaanxi Key Laboratory for Network Computing and Security Technology, Xi’an, 710048, China
Huaijun Wang, Junhuai Li, Haiyan Jin, Ting Cao & Kuanhong Cheng

Corresponding author

Correspondence to Huaijun Wang.

Ethics declarations

Conflict of Interest

We declare that we have no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Ethics Approval and Consent to Participate

Ethics approval was not required for this research.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Z., Wang, H., Li, J. et al. Rethinking probability volume for multi-view stereo: A probability analysis method. Appl Intell 55, 396 (2025). https://doi.org/10.1007/s10489-025-06284-w

Accepted: 09 January 2025
Published: 03 February 2025
DOI: https://doi.org/10.1007/s10489-025-06284-w
