Abstract
We propose a flexible 3D reconstruction method based on RGB-D data streams. In contrast to previous methods that use pure voxels or pure points as the representation, our work proposes a new representation combining voxels and points to improve reconstruction accuracy. A key insight is that points can store additional depth data that regular voxel sampling discards. By integrating points and voxels, the 3D reconstruction process is therefore accelerated owing to higher data utilization. Furthermore, the depth information stored in points is used to refine noisy depth images through a depth-image refinement method, which in turn improves the quality of the reconstructed shape. Extensive comparative experiments covering different representations (pure voxels/points) and various methods (fusion-based/learning-based, online/offline) illustrate the effectiveness of our work. Experimental results demonstrate that our method achieves real-time performance, effectively avoids artifacts, and reaches state-of-the-art accuracy. More importantly, we provide a novel way to balance the trade-off between memory overhead and reconstruction accuracy.
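The core idea — fuse depth samples into a regular voxel grid, but keep as explicit points the samples whose sub-voxel detail the grid's averaging would discard — can be illustrated with a minimal sketch. This is not the paper's implementation; the class and threshold below are hypothetical stand-ins for a TSDF-style volume:

```python
import numpy as np

class HybridVolume:
    """Minimal sketch of a hybrid voxel + point representation.

    Depth samples aligned with the regular grid update a per-voxel running
    average (a stand-in for TSDF fusion); samples far from their voxel
    center are additionally kept as explicit points, so detail lost by
    voxel averaging remains available for later depth refinement.
    All names and thresholds here are illustrative, not the paper's API.
    """

    def __init__(self, voxel_size=0.01, grid_dim=128):
        self.voxel_size = voxel_size
        self.grid_dim = grid_dim
        self.values = np.zeros((grid_dim,) * 3, dtype=np.float32)
        self.weights = np.zeros((grid_dim,) * 3, dtype=np.float32)
        self.points = []  # residual samples the grid does not capture

    def integrate(self, pts):
        """Fuse an Nx3 array of depth samples (volume-local coordinates)."""
        idx = np.floor(pts / self.voxel_size).astype(int)
        inside = np.all((idx >= 0) & (idx < self.grid_dim), axis=1)
        for p, (i, j, k) in zip(pts[inside], idx[inside]):
            # Weighted running average, as in classical volumetric fusion.
            w = self.weights[i, j, k]
            self.values[i, j, k] = (self.values[i, j, k] * w + p[2]) / (w + 1)
            self.weights[i, j, k] = w + 1
            # Sub-voxel residual: keep the raw sample as a point when it
            # lies far from the voxel center (illustrative 1/4-voxel rule).
            center = (np.array([i, j, k]) + 0.5) * self.voxel_size
            if np.linalg.norm(p - center) > 0.25 * self.voxel_size:
                self.points.append(p)
```

With a 1 cm voxel size, a sample near a voxel center is absorbed by the grid alone, while an off-center sample is also retained as a point — mirroring the paper's observation that points can carry depth data the regular voxel sampling misses.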
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China (2018YFB1700700), the National Natural Science Foundation of China (61732015, 61972340), and the Research Funding of Zhejiang University Robotics Institute.
Cite this article
Liu, X., Li, J. & Lu, G. Improving RGB-D-based 3D reconstruction by combining voxels and points. Vis Comput 39, 5309–5325 (2023). https://doi.org/10.1007/s00371-022-02661-5