Retrieving point cloud models of target objects in a scene from photographed images

Multimedia Tools and Applications

Abstract

Image-based reconstruction aims to recover 3D point cloud models of target objects from scene images photographed from different viewpoints. Existing methods often produce a large number of redundant background points, which hampers 3D modeling and other related applications. To address this issue, this work proposes an improved framework that integrates image segmentation into the point cloud retrieval procedure so that only the objects of interest in a scene are reconstructed. The framework provides two options for foreground object segmentation, and users can choose the method that yields accurate segmentation for a given scene. Feature matches are then extracted from the segmented images, and the point cloud model is recovered via two phases of dense diffusion: feature diffusion and patch diffusion. In the diffusion stage, we introduce a new normalized metric that handles both illumination change and low-texture cases to enhance the robustness of the reconstruction. The experimental results show that the proposed framework effectively avoids reconstructing irrelevant background data while producing more even and detailed point cloud models.
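The abstract does not define the new normalized metric or the segmentation options, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes zero-mean normalized cross-correlation (ZNCC) as a stand-in for an illumination-tolerant patch score, rejects nearly uniform patches as a crude low-texture guard, and filters feature matches with binary foreground masks. The names `zncc`, `keep_foreground_matches`, and the `eps` threshold are hypothetical.

```python
import numpy as np


def zncc(patch_a, patch_b, eps=1e-6):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Returns a score in [-1, 1], or None when either patch is nearly uniform
    (low texture), because the correlation is not meaningful in that case.
    This is a standard metric used here as an assumption, not the paper's.
    """
    a = np.asarray(patch_a, dtype=np.float64).ravel()
    b = np.asarray(patch_b, dtype=np.float64).ravel()
    # Subtracting the mean removes additive brightness offsets.
    a = a - a.mean()
    b = b - b.mean()
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    if norm_a < eps or norm_b < eps:
        return None  # low-texture patch: no reliable score
    # Dividing by the norms removes multiplicative contrast changes.
    return float(np.dot(a, b) / (norm_a * norm_b))


def keep_foreground_matches(matches, mask_a, mask_b):
    """Keep matches ((xa, ya), (xb, yb)) whose endpoints are both foreground.

    mask_a and mask_b are boolean arrays indexed as mask[row, col], that is
    mask[y, x]; they stand in for the output of any segmentation method.
    """
    return [
        ((xa, ya), (xb, yb))
        for (xa, ya), (xb, yb) in matches
        if mask_a[ya, xa] and mask_b[yb, xb]
    ]
```

Under these assumptions, a patch and a brightened, contrast-scaled copy of it score close to 1, a flat patch yields no score, and matches that touch the background in either image are discarded before any dense diffusion step.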



Author information

Corresponding author

Correspondence to Bo Wan.

Additional information

This paper was supported by the National Natural Science Foundation of China (Grant No. 61802294) and the China Postdoctoral Science Foundation (Grant No. 2018M633472).

Cite this article

Luo, N., Xu, Y., Wang, Q. et al. Retrieving point cloud models of target objects in a scene from photographed images. Multimed Tools Appl 80, 6311–6328 (2021). https://doi.org/10.1007/s11042-020-09879-2

  • DOI: https://doi.org/10.1007/s11042-020-09879-2
