
Real-time 3D scene reconstruction with dynamically moving object using a single depth camera

  • Original Article
  • Published in: The Visual Computer

Abstract

Online 3D reconstruction of real-world scenes has attracted increasing interest from both academia and industry, especially as consumer-level depth cameras have become widely available. Most recent online reconstruction systems take live depth data from a moving Kinect camera and incrementally fuse it into a single high-quality 3D model in real time. Although the environment in most real-world scenes is static, everyday objects in a scene often move, and they are non-trivial to reconstruct, especially when the camera itself is also moving. To solve this problem, we propose a real-time approach based on a single depth camera for the simultaneous reconstruction of a dynamic object and the static environment, and provide solutions for its key issues. In particular, we first introduce a robust optimization scheme that takes advantage of raycasted maps to segment the moving object and the background from the live depth map. The corresponding depth data are then fused into separate volumes. These volumes are raycasted to extract views of the implicit surface, which serve as a consistent reference frame for the next iteration of segmentation and tracking. Furthermore, to handle fast motion of the dynamic object and the handheld camera in the fusion stage, we propose a sequential 6D pose prediction method that greatly increases registration robustness and avoids the registration failures that occur in conventional methods. Experimental results show that our approach can reconstruct both the moving object and the static environment with rich detail, and outperforms conventional methods in multiple aspects.
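The abstract does not detail the sequential 6D pose prediction; a common way to realize such a predictor, sketched here purely as an illustrative assumption (the paper's actual formulation may differ), is constant-velocity extrapolation in SE(3): the relative rigid motion between the two most recent poses is re-applied to guess the next pose, which then seeds the frame-to-model registration so that ICP starts close to the true pose even under fast motion. All function names below are hypothetical.

```python
import numpy as np

def predict_next_pose(T_prev: np.ndarray, T_curr: np.ndarray) -> np.ndarray:
    """Extrapolate the next 4x4 rigid pose under a constant-velocity model.

    T_prev, T_curr: camera (or object) poses at frames t-1 and t.
    Returns the predicted pose at frame t+1:
        T_pred = (T_curr @ inv(T_prev)) @ T_curr
    """
    delta = T_curr @ np.linalg.inv(T_prev)  # frame-to-frame motion estimate
    return delta @ T_curr                   # re-apply it to the current pose

def make_translation(x: float, y: float, z: float) -> np.ndarray:
    """Helper: a 4x4 homogeneous transform that only translates."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T

# Example: the camera moved 1 unit along x between the last two frames,
# so the predictor places it one further unit along x.
T_prev = make_translation(0.0, 0.0, 0.0)
T_curr = make_translation(1.0, 0.0, 0.0)
T_pred = predict_next_pose(T_prev, T_curr)  # translation becomes (2, 0, 0)
```

Seeding ICP with such a prediction, rather than with the last tracked pose, is what lets registration survive large inter-frame displacements of either the handheld camera or the moving object.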





Acknowledgements

This study was funded by the National Natural Science Foundation of China (Grant Nos. 61502023 and U1736217).

Author information

Corresponding author

Correspondence to Feixiang Lu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Lu, F., Zhou, B., Zhang, Y. et al. Real-time 3D scene reconstruction with dynamically moving object using a single depth camera. Vis Comput 34, 753–763 (2018). https://doi.org/10.1007/s00371-018-1540-8
