Label Propagation for Large Scale 3D Indoor Scenes

  • Conference paper
Advances in Visual Computing (ISVC 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9474)

Abstract

RGB-D mapping or semantic mapping is becoming increasingly important for computer vision and robotics. However, manually segmenting and labeling an RGB-D image sequence or a global point cloud requires substantial human labor, which is why a satisfactory indoor dataset for testing semantic mapping systems is still lacking. Automatic label propagation can help, but almost all existing methods were designed for 2D videos and ignore the 3D characteristics of RGB-D images. In this paper, we first build a global map from the RGB-D image sequence and then propagate labels on that map. This enforces label consistency over the whole scene and reduces the number of frames that must be manually labeled. We also model the overlap information between images and use a greedy algorithm to automatically choose the frames for manual labeling. Experiments demonstrate that our method greatly reduces manual effort: for a scene containing 1831 images, labeling only 22 images (about 1.2% of the frames) achieves 93% label propagation accuracy.
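The abstract does not spell out how overlap is modeled or how the greedy choice is scored, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes each frame is described by the set of global-map point IDs it observes (the hypothetical `visible` mapping) and applies the standard greedy maximum-coverage heuristic: repeatedly pick the frame that adds the most points not yet seen by any chosen frame. The function name and the `target_coverage` parameter are illustrative assumptions.

```python
from typing import Dict, List, Set


def greedy_select_frames(
    visible: Dict[int, Set[int]],
    target_coverage: float = 0.95,
) -> List[int]:
    """Greedily choose frames to label (illustrative, not the paper's method).

    `visible` maps a frame id to the set of global-map point ids that the
    frame observes. Frames are added until the chosen frames together see
    at least `target_coverage` of all points seen by any frame.
    """
    all_points: Set[int] = set().union(*visible.values())
    needed = target_coverage * len(all_points)

    covered: Set[int] = set()
    chosen: List[int] = []
    while len(covered) < needed:
        # Frame with the largest number of not-yet-covered points;
        # ties are broken in favour of the lower frame id.
        best = max(
            (f for f in visible if f not in chosen),
            key=lambda f: (len(visible[f] - covered), -f),
            default=None,
        )
        if best is None or not (visible[best] - covered):
            break  # no remaining frame adds new points
        chosen.append(best)
        covered |= visible[best]
    return chosen


# Toy example: four frames observing seven map points in total.
frames = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1, 2}}
print(greedy_select_frames(frames, target_coverage=1.0))  # prints [2, 0]
```

In the toy example, frame 2 is picked first because it sees four points, and frame 0 then covers the remaining three, so frames 1 and 3 never need manual labels. A real system would derive `visible` from the camera poses and the reconstructed map, and might weight points or stop on a labeling budget instead of a coverage target.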

Keke Tang and Zhe Zhao: these two authors contributed equally to this work.

Notes

  1. http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html

  2. http://vision.cs.utexas.edu/projects/activeframeselection/


Corresponding author

Correspondence to Keke Tang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Tang, K., Zhao, Z., Chen, X. (2015). Label Propagation for Large Scale 3D Indoor Scenes. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2015. Lecture Notes in Computer Science, vol 9474. Springer, Cham. https://doi.org/10.1007/978-3-319-27857-5_23

  • DOI: https://doi.org/10.1007/978-3-319-27857-5_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27856-8

  • Online ISBN: 978-3-319-27857-5

  • eBook Packages: Computer Science, Computer Science (R0)
