Labeling Complete Surfaces in Scene Understanding

International Journal of Computer Vision

Abstract

Scene understanding requires reasoning about both what we can see and what is occluded. We offer a simple and general approach for inferring the labels of occluded background regions. Our approach incorporates estimates of the visible surrounding background, detected objects, and shape priors from transferred training regions. We demonstrate its ability to infer the labels of occluded background regions on three datasets (the outdoor StreetScenes dataset, the IndoorScene dataset, and the SUN09 dataset), all using the same approach. Furthermore, we extend the proposed approach to 3D space to find layered support surfaces in RGB-Depth scenes. Our experiments and analysis show that our method outperforms competent baselines.

Author information

Corresponding author

Correspondence to Ruiqi Guo.

Additional information

Communicated by Derek Hoiem, James Hays, Jianxiong Xiao, and Aditya Khosla.

About this article

Cite this article

Guo, R., Hoiem, D. Labeling Complete Surfaces in Scene Understanding. Int J Comput Vis 112, 172–187 (2015). https://doi.org/10.1007/s11263-014-0776-7

  • DOI: https://doi.org/10.1007/s11263-014-0776-7
