Abstract
Image-based reconstruction aims to recover 3D point cloud models of target objects from scene images photographed from different viewpoints. Existing methods often produce a large number of redundant background points, which complicates 3D modeling and other related applications. To address this issue, this work proposes an improved framework that incorporates image segmentation into the point cloud retrieval procedure so that only the objects of interest in a scene are reconstructed. The framework provides two options for foreground object segmentation, and users can choose the appropriate method to obtain an accurate segmentation for each scene. Feature matches are then extracted from the segmented images, and the point cloud model is recovered via two phases of dense diffusion: feature diffusion and patch diffusion. In the diffusion stage, we introduce a new normalized metric that handles both illumination changes and low-texture regions to make the reconstruction more robust. The experimental results show that the proposed framework effectively avoids reconstructing irrelevant background data while producing more even and detailed point cloud models.
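The abstract does not define the proposed metric, so the following is only a minimal sketch of the general idea, assuming the standard zero-mean normalized cross-correlation (ZNCC) as a stand-in and a binary foreground mask from some prior segmentation step. ZNCC is unchanged by gain and offset changes in brightness but becomes undefined on textureless patches, which is the kind of case a metric of this sort has to handle. The names `zncc` and `masked_patch` are hypothetical and are not taken from the paper.

```python
import math


def zncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross-correlation of two flat intensity lists.

    Stand-in metric for illustration only; not the metric proposed in the paper.
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal length")
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    norm_a = math.sqrt(sum(d * d for d in dev_a))
    norm_b = math.sqrt(sum(d * d for d in dev_b))
    if norm_a < eps or norm_b < eps:
        # Low-texture patch: the correlation is undefined, so report 0.0
        # (an assumption made for this sketch).
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / (norm_a * norm_b)


def masked_patch(image, mask, row, col, half=1):
    """Return the flattened window around (row, col), or None if the window
    leaves the image or touches a background pixel (mask value False)."""
    height, width = len(image), len(image[0])
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if not (0 <= r < height and 0 <= c < width) or not mask[r][c]:
                return None
            values.append(image[r][c])
    return values
```

Rejecting any window that touches the background is one simple way to keep matches on the segmented object; a real pipeline might instead weight pixels by the mask.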
Additional information
This paper was supported by the National Natural Science Foundation of China (Grant No. 61802294) and the China Postdoctoral Science Foundation (Grant No. 2018M633472).
About this article
Cite this article
Luo, N., Xu, Y., Wang, Q. et al. Retrieving point cloud models of target objects in a scene from photographed images. Multimed Tools Appl 80, 6311–6328 (2021). https://doi.org/10.1007/s11042-020-09879-2
DOI: https://doi.org/10.1007/s11042-020-09879-2