
SDF-2-SDF Registration for Real-Time 3D Reconstruction from RGB-D Data

Published in: International Journal of Computer Vision

Abstract

We tackle the task of dense 3D reconstruction from RGB-D data. In contrast to the majority of existing methods, we focus not only on trajectory estimation accuracy, but also on reconstruction precision. The key technique is SDF-2-SDF registration, which is a correspondence-free, symmetric, dense energy minimization method, performed via the direct voxel-wise difference between a pair of signed distance fields. It has a wider convergence basin than traditional point cloud registration and cloud-to-volume alignment techniques. Furthermore, its formulation allows for straightforward incorporation of photometric and additional geometric constraints. We employ SDF-2-SDF registration in two applications. First, we perform small-to-medium scale object reconstruction entirely on the CPU. To this end, the camera is tracked frame-to-frame in real time. Then, the initial pose estimates are refined globally in a lightweight optimization framework, which does not involve a pose graph. We combine these procedures into our second, fully real-time application for larger-scale object reconstruction and SLAM. It is implemented as a hybrid system, whereby tracking is done on the GPU, while refinement runs concurrently over batches on the CPU. To bound memory and runtime footprints, registration is done over a fixed number of limited-extent volumes, anchored at geometry-rich locations. Extensive qualitative and quantitative evaluation of both trajectory accuracy and model fidelity on several public RGB-D datasets, acquired with sensors of varying quality, demonstrates higher precision than related techniques.
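To make the core idea concrete, below is a minimal, illustrative sketch of a voxel-wise SDF difference energy of the kind the abstract describes. The function and variable names are hypothetical, not the authors' implementation, and the paper's full formulation additionally incorporates the photometric and geometric constraints mentioned above.

import numpy as np

def sdf_energy(phi_ref, phi_cur):
    """Sum of squared voxel-wise differences between two signed distance fields
    sampled on the same grid; phi_cur is assumed to be already resampled under
    the candidate camera pose."""
    diff = phi_ref - phi_cur
    return 0.5 * float(np.sum(diff ** 2))

# Toy usage: SDFs of two spheres on a 64^3 grid. The energy is zero when the
# surfaces coincide and grows as one sphere is translated, which is the
# behaviour a pose optimizer exploits.
axes = [np.linspace(-1.0, 1.0, 64)] * 3
grid = np.stack(np.meshgrid(*axes, indexing="ij"))  # shape (3, 64, 64, 64)

def sphere_sdf(center, radius=0.5):
    c = np.asarray(center, dtype=float)[:, None, None, None]
    return np.linalg.norm(grid - c, axis=0) - radius

print(sdf_energy(sphere_sdf((0.0, 0.0, 0.0)), sphere_sdf((0.0, 0.0, 0.0))))  # 0.0 (aligned)
print(sdf_energy(sphere_sdf((0.0, 0.0, 0.0)), sphere_sdf((0.1, 0.0, 0.0))))  # > 0  (misaligned)

In the actual method, an objective of this form would be minimized over the rigid-body pose used to resample the second field, rather than merely evaluated at fixed volumes as in this toy example.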

Author information

Correspondence to Miroslava Slavcheva.

Additional information

Communicated by Michael S. Brown.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (AVI, 23,636 KB)

About this article

Cite this article

Slavcheva, M., Kehl, W., Navab, N. et al. SDF-2-SDF Registration for Real-Time 3D Reconstruction from RGB-D Data. Int J Comput Vis 126, 615–636 (2018). https://doi.org/10.1007/s11263-017-1057-z
