
Reconstructing three-dimensional models of objects using a Kinect sensor

  • Original Article
  • Published in The Visual Computer

Abstract

Advances in sensor technology allow us to acquire three-dimensional (3D) information from a scene using a low-cost RGB-D sensor such as the Kinect. Although this sensor can recover the 3D structure of a scene, it cannot distinguish a target object from the background. We therefore incorporate an interactive 3D segmentation algorithm into a well-known Kinect scene reconstruction system, KinectFusion, to extract an object from the scene and thereby obtain a 3D point cloud of that object. With this system, a user can freely move the Kinect sensor to reconstruct the scene and then select foreground/background seeds from the reconstructed point cloud; the system performs the remaining tasks automatically to complete the 3D reconstruction of the selected object. An advantage of this system is that the user need not select the foreground/background seeds very carefully, which greatly reduces operational complexity. Moreover, previous segmentation results are carried over to the next phase as new foreground/background seeds, which minimizes the required user intervention. With a simple seed selection, the point cloud of the selected object is gradually recovered as the user moves the sensor to different viewpoints. Several experiments were conducted, and the results confirm the effectiveness of the proposed system: the 3D structures of objects with complex shapes are well reconstructed.
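
To make the pipeline above concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of the seed-based segmentation step on a reconstructed point cloud: points are connected to their nearest neighbours, the user's seeds are pinned to source/sink terminals, and a minimum s-t graph cut separates object from background. The k-nearest-neighbour construction, the colour-similarity weight, and the parameters k and sigma are assumptions chosen for illustration; the cut is solved with networkx.

    import numpy as np
    import networkx as nx
    from scipy.spatial import cKDTree

    def segment_point_cloud(points, colors, fg_seeds, bg_seeds, k=8, sigma=0.1):
        """Label each point as object (True) or background (False).
        points: (N, 3) float array of 3D positions.
        colors: (N, 3) float array of RGB values in [0, 1].
        fg_seeds, bg_seeds: indices of user-selected seed points.
        k, sigma: illustrative neighbourhood size and colour-similarity scale.
        """
        n = len(points)
        HARD = 1e9  # effectively infinite capacity: seeds are hard constraints

        # k-nearest-neighbour graph over the cloud; the first neighbour
        # returned for each query point is the point itself, so skip it.
        _, nbrs = cKDTree(points).query(points, k=k + 1)

        G = nx.Graph()
        src, snk = "FG", "BG"  # terminal nodes of the s-t cut

        # n-links: neighbouring points with similar colour are expensive to cut.
        for i in range(n):
            for j in nbrs[i][1:]:
                w = np.exp(-np.sum((colors[i] - colors[int(j)]) ** 2)
                           / (2.0 * sigma ** 2))
                G.add_edge(i, int(j), capacity=float(w))

        # t-links: pin the user-selected seeds to their terminals.
        for i in fg_seeds:
            G.add_edge(src, int(i), capacity=HARD)
        for i in bg_seeds:
            G.add_edge(snk, int(i), capacity=HARD)

        # The minimum s-t cut partitions the points into object and background.
        _, (fg_side, _) = nx.minimum_cut(G, src, snk)
        labels = np.zeros(n, dtype=bool)
        labels[[i for i in fg_side if i not in (src, snk)]] = True
        return labels

In the full system described above, the labels returned by such a routine would be carried over as the foreground/background seeds for the next phase as the sensor moves, so the user's initial rough seed selection remains the only manual step.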



Acknowledgements

This work was supported by the Ministry of Science and Technology, Taiwan, under Grant Nos. NSC 102-2221-E-155-075 and MOST 105-2218-E-155-010.

Author information


Corresponding author

Correspondence to Chin-Hung Teng.


About this article


Cite this article

Teng, CH., Chuo, KY. & Hsieh, CY. Reconstructing three-dimensional models of objects using a Kinect sensor. Vis Comput 34, 1507–1523 (2018). https://doi.org/10.1007/s00371-017-1425-2

