Object Labeling in 3D from Multi-view Scenes Using Gaussian–Hermite Moment-Based Depth Map

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1024)

Abstract

Both the depth and the intensity of a pixel play a significant role in labeling objects in 3D environments. This paper presents a novel approach to labeling objects in multi-view video sequences by incorporating rich depth information. The depth map of a scene is estimated from focus cues using the Gaussian–Hermite moments (GHMs) of local neighboring pixels. The depth map obtained from GHMs is expected to provide more robust features than other popular depth maps, such as those obtained from Kinect and from defocus cues. We use the depth and intensity values of a pixel to score every point of a video frame and generate labeled probability maps in a 3D environment. These maps are then used to create a 3D scene in which the available objects are labeled distinctively. Experimental results show that the proposed approach yields excellent object-labeling performance for different multi-view scenes taken from the RGB-D object dataset, in particular showing significant improvements in precision–recall characteristics and F1-score.
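
To make the idea of GHMs of local neighboring pixels concrete, the following is a minimal sketch of how local Gaussian–Hermite moments could be computed at every pixel with separable filters. It is not the authors' depth estimator: the window half-width, the scale `sigma`, the maximum order, and the `focus_energy` score (a sum of squared non-DC moments used as a rough sharpness proxy) are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermval


def gh_kernel(order, half_width=3, sigma=1.0):
    """Sampled 1D Gaussian-Hermite function of the given order."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    coeffs = np.zeros(order + 1)
    coeffs[order] = 1.0  # selects the physicists' Hermite polynomial H_order
    norm = 1.0 / math.sqrt(
        (2.0 ** order) * math.factorial(order) * math.sqrt(math.pi) * sigma
    )
    return norm * np.exp(-x ** 2 / (2.0 * sigma ** 2)) * hermval(x / sigma, coeffs)


def local_ghm(image, p, q, half_width=3, sigma=1.0):
    """Local moment of order (p, q) at every pixel via separable filtering.

    np.convolve flips the kernel, so odd orders differ from a correlation
    by a sign; the squared values used below are unaffected.
    """
    kx = gh_kernel(p, half_width, sigma)
    ky = gh_kernel(q, half_width, sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, kx, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, ky, mode="same"), 0, rows)


def focus_energy(image, max_order=2, half_width=3, sigma=1.0):
    """Hypothetical sharpness proxy: sum of squared moments with 0 < p + q <= max_order."""
    energy = np.zeros_like(image, dtype=float)
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p == 0 and q == 0:
                continue  # skip the smoothing-only (0, 0) moment
            energy += local_ghm(image, p, q, half_width, sigma) ** 2
    return energy
```

Calling `focus_energy(gray_image)` on a 2D float array returns a map of the same shape whose values are larger around sharp intensity changes. Turning such a map into a depth estimate would need the calibration and modeling steps described in the paper, which this sketch does not attempt.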

Author information

Corresponding authors

Correspondence to Sadman Sakib Enan or S. M. Mahbubur Rahman.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

Cite this paper

Enan, S.S., Mahbubur Rahman, S.M., Haque, S., Howlader, T., Hatzinakos, D. (2020). Object Labeling in 3D from Multi-view Scenes Using Gaussian–Hermite Moment-Based Depth Map. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1024. Springer, Singapore. https://doi.org/10.1007/978-981-32-9291-8_8

  • DOI: https://doi.org/10.1007/978-981-32-9291-8_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-32-9290-1

  • Online ISBN: 978-981-32-9291-8

  • eBook Packages: Engineering, Engineering (R0)
