
Parametric Surface Representation with Bump Image for Dense 3D Modeling Using an RGB-D Camera

Published in: International Journal of Computer Vision

Abstract

When constructing a dense 3D model of an indoor static scene from a sequence of RGB-D images, the choice of the 3D representation (e.g., 3D mesh, point cloud, or implicit function) is of crucial importance. In the last few years, the volumetric truncated signed distance function (TSDF) and its extensions have become popular in the community and are widely used for dense 3D modeling with RGB-D sensors. However, because this representation is voxel based, it offers few possibilities for manipulating and/or editing the constructed 3D model, which limits its applicability. In particular, the amount of data required to maintain the volumetric TSDF rapidly becomes huge, which limits portability. Moreover, simplifications (such as mesh extraction and surface simplification) significantly reduce the accuracy of the 3D model (especially in the color space), and editing the 3D model is difficult. We propose a novel compact, flexible, and accurate 3D surface representation based on parametric surface patches augmented by geometric and color texture images. Simple parametric shapes such as planes are roughly fitted to the input depth images, and the deviations of the 3D measurements from the fitted parametric surfaces are fused into a geometric texture image (called the Bump image). A confidence image and a color texture image are also built. Our 3D scene representation is accurate yet memory efficient. Moreover, updating or editing the 3D model becomes trivial, since it reduces to manipulating 2D images. Our experimental results demonstrate the advantages of the proposed 3D representation through a concrete indoor scene reconstruction application.
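To make the fusion step concrete, below is a minimal sketch (in Python with NumPy; not the authors' implementation) of how deviations from a fitted plane can be accumulated into a Bump image: each 3D point is expressed in the plane's local frame, its in-plane coordinates select a texel, and its signed offset along the normal is merged with a confidence-weighted running average. The function name, texel resolution, and confidence cap are illustrative assumptions.

    import numpy as np

    # Minimal sketch (not the authors' code): fuse signed deviations of 3D
    # points from a fitted plane into a Bump image with a confidence-weighted
    # running average. 'res' (texel size, metres) and 'max_conf' are assumed.
    def fuse_into_bump(points, origin, tangent, bitangent, normal,
                       bump, confidence, res=0.005, max_conf=100.0):
        rel = points - origin                  # points in the plane's frame
        u = rel @ tangent                      # in-plane coordinate (tangent)
        v = rel @ bitangent                    # in-plane coordinate (bitangent)
        d = rel @ normal                       # signed deviation: the "bump"
        h, w = bump.shape
        iu = np.round(u / res).astype(int)     # texel indices
        iv = np.round(v / res).astype(int)
        ok = (iu >= 0) & (iu < w) & (iv >= 0) & (iv < h)
        for x, y, dev in zip(iu[ok], iv[ok], d[ok]):
            c = confidence[y, x]
            bump[y, x] = (c * bump[y, x] + dev) / (c + 1.0)  # running mean
            confidence[y, x] = min(c + 1.0, max_conf)        # cap confidence

In the same spirit, the color texture image can be updated by confidence-weighted averaging of the RGB values that project onto each texel.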


Notes

  1. The notation \(\llbracket a,b \rrbracket \) denotes the integer interval between a and b.

  2. Note that in our implementation the plane detection is run every 20 frames.

  3. Radial distortion, as exhibited in Zhou et al. (2013), is ignored in this work. Properly handling this noise is left for future work; Zhou et al. (2013) gives a pointer on how to handle it.

  4. For reference, InfiniTAM was reported to run at over 20 fps.

  5. The GPU memory usage at run-time depends only on the complexity of the scene (i.e., the number and size of planar patches in the current view frustum).

  6. Memory usage at run-time was less than 150 MB on the GPU and less than 35 MB on the CPU.

  7. At run-time, the GPU memory usage never exceeded 300 MB, while the number of visible planar patches never exceeded 28.

  8. With the Library data, memory usage at run-time never exceeded 447 MB on the GPU and 2180 MB on the CPU; with the Library-2 data, it never exceeded 311 MB on the GPU and 881 MB on the CPU.

  9. (1) The tangent vector is made orthogonal to the normal vector and normalised; (2) the bitangent vector is made orthogonal to both the normal and tangent vectors and then normalised (a sketch follows this list).
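As a complement to note 9, here is a minimal sketch of that two-step orthonormalisation (a Gram-Schmidt procedure, in the spirit of Lengyel 2001); the function name is an illustrative assumption, and the inputs are rough tangent-space estimates:

    import numpy as np

    # Sketch of note 9: (1) make the tangent orthogonal to the normal, then
    # (2) make the bitangent orthogonal to both, normalising at each step.
    def orthonormalise_frame(normal, tangent, bitangent):
        n = normal / np.linalg.norm(normal)
        t = tangent - (tangent @ n) * n                             # step (1)
        t = t / np.linalg.norm(t)
        b = bitangent - (bitangent @ n) * n - (bitangent @ t) * t  # step (2)
        b = b / np.linalg.norm(b)
        return n, t, b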

References

  • Anasosalu, P. K., Thomas, D., & Sugimoto, A. (2013). Compact and accurate 3-D face modeling using an RGB-D camera: Let's open the door to 3-D video conference. In: Proceedings of CDC4CV.

  • Besl, P. J., & McKay, N. D. (1992). A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239–256.


  • Blanz, V., & Vetter, T. (2003). Face recognition based on fitting a 3D morphable model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9), 1063–1074.


  • Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679–698.


  • Chen, J., Bautembach, D., & Izadi, S. (2013). Scalable real-time volumetric surface reconstruction. ACM Transactions on Graphics, 32(4), 113:1–113:8.

  • Davison, A., Reid, I., Molton, N., & Stasse, O. (2007). MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 1052–1067.


  • Henry, P., Fox, D., Bhowmik, A., & Mongia, R. (2013). Patch volumes: Segmentation-based consistent mapping with RGB-D cameras. In: Proceedings of 3DV’13.

  • Henry, P., Krainin, M., Herbst, E., Ren, X., & Fox, D. (2012). RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. International Journal of Robotics Research, 31(5), 647–663.


  • Hernandez, M., Choi, J., & Medioni, G. (2012). Laser scan quality 3-D face modeling using a low-cost depth camera. In: Proceedings of the 20th European signal processing conference (EUSIPCO), pp. 1995–1999.

  • Jaeggli, T., Koninckx, T. P., & Van Gool, L. (2003). Online 3D acquisition and model integration. In: Proceedings of ProCam'03.

  • Kähler, O., Prisacariu, V. A., Ren, C. Y., Sun, X., Torr, P. H. S., & Murray, D. W. (2015). Very high frame rate volumetric integration of depth images on mobile devices. IEEE Transactions on Visualization and Computer Graphics (Proceedings of the international symposium on mixed and augmented reality).

  • Kazhdan, M., Bolitho, M., & Hoppe, H. (2006). Poisson surface reconstruction. In: Proceedings of the Eurographics symposium on geometry processing.

  • Lengyel, E. (2001). Computing tangent space basis vectors for an arbitrary mesh. In: Terathon Software 3D Graphics Library.

  • Lowe, D. G. (1999). Object recognition from local scale-invariant features. In: Proceedings of ICCV, pp. 1150–1157.

  • Nießner, M., Zollhöfer, M., Izadi, S., & Stamminger, M. (2013). Real-time 3D reconstruction at scale using voxel hashing. ACM Transactions on Graphics, 32(6), 169:1–169:11.


  • Newcombe, R., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A., Kohli, P., Shotton, J., Hodges, S., & Fitzgibbon, A. (2011). KinectFusion: Real-time dense surface mapping and tracking. In: Proceedings of ISMAR'11, pp. 127–136.

  • Nguyen, C., Izadi, S., & Lovell, D. (2012). Modeling Kinect sensor noise for improved 3D reconstruction and tracking. In: Proceedings of 3DIM/PVT'12, pp. 524–530.

  • Pfister, H., Zwicker, M., van Baar, J., & Gross, M. (2000). Surfels: Surface elements as rendering primitives. In: ACM Transactions on Graphics (Proceedings of SIGGRAPH'00).

  • Roth, H., & Vona, M. (2012). Moving volume KinectFusion. In: Proceedings of BMVC.

  • Segal, A., Haehnel, D., & Thrun, S. (2009). Generalized-ICP. In: Robotics: Science and systems.

  • Steinbrücker, F., Kerl, C., Sturm, J., & Cremers, D. (2013). Large-scale multi-resolution surface reconstruction from RGB-D sequences. In: Proceedings of the international conference on computer vision (ICCV'13).

  • Thomas, D., & Sugimoto, A. (2013). A flexible scene representation for 3D reconstruction using an RGB-D camera. In: Proceedings of ICCV.

  • Thomas, D., & Sugimoto, A. (2014). A two-stage strategy for real-time dense 3D reconstruction of large-scale scenes. In: Proceedings of ECCV workshops’14 (CDC4CV).

  • Weise, T., Wismer, T., Leibe, B., & Van Gool, L. (2009). In-hand scanning with online loop closure. In: Proceedings of ICCV workshops'09, pp. 1630–1637.

  • Whelan, T., McDonald, J., Kaess, M., Fallon, M., Johansson, H., & Leonard, J. (2012). Kintinuous: Spatially extended KinectFusion. In: Proceedings of the RSS workshop on RGB-D: Advanced reasoning with depth cameras.

  • Zeng, M., Zhao, F., Zheng, J., & Liu, X. (2013). Octree-based fusion for real-time 3D reconstruction. Graphical Models, 75(3), 126–136.


  • Zhou, Q.-Y., & Koltun, V. (2013). Dense scene reconstruction with points of interest. ACM Transactions on Graphics, 32(4), 112:1–112:8.


  • Zhou, Q.-Y., Miller, S., & Koltun, V. (2013). Elastic fragments for dense scene reconstruction. In: Proceedings of ICCV.


Acknowledgements

This work was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Author information


Corresponding author

Correspondence to Diego Thomas.

Additional information

Communicated by S.-C. Zhu.


About this article


Cite this article

Thomas, D., Sugimoto, A. Parametric Surface Representation with Bump Image for Dense 3D Modeling Using an RGB-D Camera. Int J Comput Vis 123, 206–225 (2017). https://doi.org/10.1007/s11263-016-0969-3
