Histogram of Oriented Normal Vectors for Object Recognition with a Depth Sensor

Tang, Shuai; Wang, Xiaoyu; Lv, Xutao; Han, Tony X.; Keller, James; He, Zhihai; Skubic, Marjorie; Lao, Shihong

doi:10.1007/978-3-642-37444-9_41

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7725)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We propose a feature, the Histogram of Oriented Normal Vectors (HONV), designed specifically to capture local geometric characteristics for object recognition with a depth sensor. Through our derivation, the normal vector orientation represented as an ordered pair of azimuthal angle and zenith angle can be easily computed from the gradients of the depth image. We form the HONV as a concatenation of local histograms of azimuthal angle and zenith angle. Since the HONV is inherently the local distribution of the tangent plane orientation of an object surface, we use it as a feature for object detection/classification tasks. The object detection experiments on the standard RGB-D dataset [1] and a self-collected Chair-D dataset show that the HONV significantly outperforms traditional features such as HOG on the depth image and HOG on the intensity image, with an improvement of 11.6% in average precision. For object classification, the HONV achieved 5.0% improvement over state-of-the-art approaches.
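The computation outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the depth patch is a surface z = d(x, y), estimates the gradients with central differences, takes the azimuthal angle as atan2(dy, dx) and the zenith angle as atan(sqrt(dx² + dy²)), and concatenates two normalized 1-D histograms for a single cell. The function names, bin counts, and angle conventions are illustrative choices, and the sketch ignores camera calibration and depth units.

```python
import math


def angle_histogram(values, lo, hi, bins):
    """Count values into equal-width bins over [lo, hi]; hi falls in the last bin."""
    hist = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1
    return hist


def honv_cell(depth, az_bins=8, ze_bins=4):
    """Hypothetical single-cell descriptor from a 2-D depth patch (list of rows).

    Returns az_bins + ze_bins values: a normalized azimuth histogram
    followed by a normalized zenith histogram.
    """
    rows, cols = len(depth), len(depth[0])
    azimuths, zeniths = [], []
    # Interior pixels only, so central differences are always defined.
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            dx = (depth[y][x + 1] - depth[y][x - 1]) / 2.0
            dy = (depth[y + 1][x] - depth[y - 1][x]) / 2.0
            # Assumed convention: azimuth of the depth gradient direction.
            azimuths.append(math.atan2(dy, dx))
            # Tilt of the normal away from the viewing axis, in [0, pi/2).
            zeniths.append(math.atan(math.hypot(dx, dy)))
    total = len(azimuths)
    az_hist = angle_histogram(azimuths, -math.pi, math.pi, az_bins)
    ze_hist = angle_histogram(zeniths, 0.0, math.pi / 2, ze_bins)
    return [c / total for c in az_hist + ze_hist]


if __name__ == "__main__":
    # A plane tilted along x: every interior pixel has the same orientation.
    tilted = [[2.0 * x for x in range(5)] for _ in range(5)]
    print(honv_cell(tilted, az_bins=4, ze_bins=4))
```

In a full detector such per-cell histograms would typically be computed over a grid of cells and concatenated, much as HOG does for intensity gradients; how the paper arranges and normalizes cells is not stated in the abstract.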

References

Lai, K., Bo, L., Ren, X., Fox, D.: A large-scale hierarchical multi-view rgb-d object dataset. In: International Conference on Robotics and Automation (2011)
Microsoft Corp., http://www.xbox.com/en-US/kinect
PrimeSense Corp., http://www.primesense.com/
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)
Ikemura, S., Fujiyoshi, H.: Real-Time Human Detection Using Relational Depth Similarity Features. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010, Part IV. LNCS, vol. 6495, pp. 25–38. Springer, Heidelberg (2011)
Xia, L., Chen, C.C., Aggarwal, J.K.: Human detection using depth information by kinect. In: Workshop on Human Activity Understanding from 3D Data in conjunction with IEEE Conference on Computer Vision and Pattern Recognition, HAU3D (2011)
Bo, L., Ren, X., Fox, D.: Depth kernel descriptors for object recognition. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2011)
Bo, L., Lai, K., Ren, X., Fox, D.: Object recognition with hierarchical kernel descriptors. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)
Lai, K., Bo, L., Ren, X., Fox, D.: Sparse distance learning for object recognition combining rgb and depth information. In: International Conference on Robotics and Automation (2011)
Ding, H., Moutarde, F., Shaiek, A.: 3d object recognition and person facial identification using time-averaged single-views from time-of-flight 3d depth-camera. In: Eurographics Workshop on 3D Object Retrieval (2010)
Oppenheim, A.V., Schafer, R.W., Buck, J.R.: Discrete-time signal processing, 2nd edn. Prentice-Hall, Inc. (1999)
Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., Poggio, T.: Pedestrian detection using wavelet templates. In: IEEE Conference on Computer Vision and Pattern Recognition (1997)
Viola, P., Jones, M.: Robust real-time object detection. International Journal of Computer Vision (2002)
Lowe, D.G.: Object recognition from local scale-invariant features. In: IEEE International Conference on Computer Vision (1999)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893 (2005)
Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)
Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part II. LNCS, vol. 3952, pp. 428–441. Springer, Heidelberg (2006)
Zhu, Q., Avidan, S., Chen Yeh, M., Ting Cheng, K.: Fast human detection using a cascade of histograms of oriented gradients. In: IEEE Conference on Computer Vision and Pattern Recognition (2006)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Cascade object detection with deformable part models. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)
Ahonen, T., Hadid, A., Pietikäinen, M.: Face Recognition with Local Binary Patterns. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004, Part I. LNCS, vol. 3021, pp. 469–481. Springer, Heidelberg (2004)
Wang, X., Han, T.X., Yan, S.: An hog-lbp human detector with partial occlusion handling. In: IEEE International Conference on Computer Vision (2009)
Sabata, B., Arman, F., Aggarwal, J.K.: Segmentation of 3d range images using pyramidal data structures. CVGIP: Image Underst. 57, 373–387 (1993)
Vemuri, B.C., Mitiche, A., Aggarwal, J.K.: Curvature-based representation of objects from range data. Image Vision Comput. 4, 107–114 (1986)
Zhu, Y., Fujimura, K.: 3d head pose estimation with optical flow and depth constraints. 3D Digital Imaging and Modeling (2003)
Ess, A., Leibe, B., Gool, L.V.: Depth and appearance for mobile scene analysis. In: IEEE International Conference on Computer Vision (2007)
Johnson, A., Hebert, M.: Using spin images for efficient object recognition in cluttered 3d scenes. TPAMI (1999)
Bhattacharyya, A.: On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin of the Calcutta Mathematical Society 35, 99–109 (1943)
Cai, Q., Gallup, D., Zhang, C., Zhang, Z.: 3D Deformable Face Tracking with a Commodity Depth Camera. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 229–242. Springer, Heidelberg (2010)
Henry, P., Krainin, M., Herbst, E., Ren, X., Fox, D.: Rgbd mapping: Using depth cameras for dense 3d modeling of indoor environments. In: RGB-D: Advanced Reasoning with Depth Cameras Workshop in Conjunction with RSS (2010)
Du, H., Henry, P., Ren, X., Cheng, M., Goldman, D.B., Seitz, S.M., Fox, D.: Interactive 3d modeling of indoor environments with a consumer depth camera. In: ACM International Conference on Ubiquitous Computing (2011)
Herbst, E., Ren, X., Fox, D.: Rgb-d object discovery via multi-scene analysis. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2011)
Yu, K., Zhang, T.: Improved local coordinate coding using local tangents. In: Proceedings of the 27th International Conference on Machine Learning, ICML 2010 (2010)
Yu, K., Zhang, T., Gong, Y.: Nonlinear learning using local coordinate coding. In: Advances in Neural Information Processing Systems, vol. 22, pp. 2223–2231 (2009)
Xu, D., Xu, W.: Description and recognition of object contours using arc length and tangent orientation. Pattern Recognition Letters, 855–864 (2005)
Joachims, T.: Making large-scale svm learning practical. LS8-Report 24, Universität Dortmund, LS VIII-Report (1998)
Cheng, Y.: Mean shift, mode seeking, and clustering. TPAMI 17 (1995)
Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 603–619 (2002)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision 88, 303–338 (2010)

Author information

Authors and Affiliations

ECE Department, University of Missouri, Columbia, MO, USA
Shuai Tang, Xiaoyu Wang, Xutao Lv, Tony X. Han, James Keller, Zhihai He & Marjorie Skubic
Core Technology Center of Omron Corporation, Kyoto, Japan
Shihong Lao

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 151-744, Seoul, Korea
Kyoung Mu Lee
Microsoft Research Asia, No. 5, Danling st., Haidian district, 100080, Beijing, P.R. China
Yasuyuki Matsushita
School of Interactive Computing, Georgia Institute of Technology, 801 Atlantic Drive, CCB 315, 30332, Atlanta, GA, USA
James M. Rehg
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Zhong Quan Cun East Road 95, Haidian District, 100 190, Beijing, P.R. China
Zhanyi Hu

Cite this paper

Tang, S. et al. (2013). Histogram of Oriented Normal Vectors for Object Recognition with a Depth Sensor. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_41

DOI: https://doi.org/10.1007/978-3-642-37444-9_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37443-2
Online ISBN: 978-3-642-37444-9
eBook Packages: Computer Science, Computer Science (R0)