Fast Semantic Segmentation of RGB-D Scenes with GPU-Accelerated Deep Neural Networks

  • Conference paper
KI 2014: Advances in Artificial Intelligence (KI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8736)

Abstract

In semantic scene segmentation, every pixel of an image is assigned a category label. This task can be made easier by incorporating depth information, which structured light sensors provide. Depth, however, has very different properties from RGB image channels. In this paper, we present a novel method to provide depth information to convolutional neural networks. For this purpose, we apply a simplified version of the histogram of oriented depth (HOD) descriptor to the depth channel. We evaluate the network on the challenging NYU Depth V2 dataset and show that with our method, we can reach competitive performance at a high frame rate.
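The abstract does not say how the HOD descriptor is simplified, so the following is only a minimal sketch of the general idea behind histogram-of-oriented-depth features: HOG-style orientation histograms computed on the depth channel instead of on intensity. The function name `depth_orientation_histograms`, the cell size, the number of bins, and the per-cell L2 normalisation are all illustrative assumptions, not the authors' settings.

```python
import numpy as np


def depth_orientation_histograms(depth, cell=8, n_bins=9):
    """Per-cell histograms of depth-gradient orientation (HOG-style sketch).

    All parameters are illustrative assumptions, not values from the paper.
    """
    depth = np.asarray(depth, dtype=np.float64)
    # Finite-difference gradients along rows (y) and columns (x).
    gy, gx = np.gradient(depth)
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient orientation, folded into [0, pi).
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((orientation / np.pi * n_bins).astype(int), n_bins - 1)

    h, w = depth.shape
    rows, cols = h // cell, w // cell
    hist = np.zeros((rows, cols, n_bins))
    for b in range(n_bins):
        # Magnitude-weighted votes for bin b, summed over each cell.
        votes = np.where(bins == b, magnitude, 0.0)[:rows * cell, :cols * cell]
        hist[:, :, b] = votes.reshape(rows, cell, cols, cell).sum(axis=(1, 3))
    # L2-normalise each cell's histogram; the epsilon avoids division by zero.
    norm = np.linalg.norm(hist, axis=2, keepdims=True)
    return hist / (norm + 1e-6)
```

The resulting array has one channel per orientation bin and could, for instance, be stacked with the RGB channels as additional input maps for a convolutional network. Whether the paper feeds the descriptor to the network in this way is not stated in the abstract.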

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Höft, N., Schulz, H., Behnke, S. (2014). Fast Semantic Segmentation of RGB-D Scenes with GPU-Accelerated Deep Neural Networks. In: Lutz, C., Thielscher, M. (eds) KI 2014: Advances in Artificial Intelligence. KI 2014. Lecture Notes in Computer Science, vol 8736. Springer, Cham. https://doi.org/10.1007/978-3-319-11206-0_9

  • DOI: https://doi.org/10.1007/978-3-319-11206-0_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11205-3

  • Online ISBN: 978-3-319-11206-0

  • eBook Packages: Computer Science, Computer Science (R0)
