Abstract
We propose a feature, the Histogram of Oriented Normal Vectors (HONV), designed specifically to capture local geometric characteristics for object recognition with a depth sensor. We show that the normal vector orientation, represented as an ordered pair of azimuthal angle and zenith angle, can be computed directly from the gradients of the depth image. We form the HONV as a concatenation of local histograms of azimuthal angle and zenith angle. Since the HONV is inherently the local distribution of the tangent plane orientation of an object surface, we use it as a feature for object detection and classification tasks. Object detection experiments on the standard RGB-D dataset [1] and a self-collected Chair-D dataset show that the HONV significantly outperforms traditional features such as HOG on the depth image and HOG on the intensity image, with an improvement of 11.6% in average precision. For object classification, the HONV achieves a 5.0% improvement over state-of-the-art approaches.
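As a rough illustration of the idea in the abstract, the sketch below computes azimuth and zenith angles from depth gradients and concatenates per-cell histograms. It is not the authors' implementation. Several points are assumptions made only for this example: the surface is treated as z = d(x, y) in pixel units with unnormalised normal (-dd/dx, -dd/dy, 1), each cell uses a joint 2D histogram over both angles, and the function names (`normal_angles`, `honv_like`), cell size, and bin counts are hypothetical.

```python
import numpy as np


def normal_angles(depth):
    """Return per-pixel (azimuth, zenith) of the surface normal of z = d(x, y).

    Assumption: the normal is taken as (-gx, -gy, 1) with gradients measured
    in pixel units; a real sensor would need calibration to metric units.
    """
    # np.gradient returns the derivative along axis 0 (rows, y) first,
    # then along axis 1 (columns, x).
    gy, gx = np.gradient(depth.astype(np.float64))
    azimuth = np.arctan2(-gy, -gx)        # angle in the image plane, [-pi, pi]
    zenith = np.arctan(np.hypot(gx, gy))  # tilt away from the viewing axis, [0, pi/2)
    return azimuth, zenith


def honv_like(depth, cell=8, az_bins=8, ze_bins=4):
    """Concatenate normalised per-cell histograms of (azimuth, zenith).

    Assumption: non-overlapping square cells and a joint 2D histogram per cell.
    """
    azimuth, zenith = normal_angles(depth)
    rows, cols = depth.shape
    features = []
    for r in range(0, rows - cell + 1, cell):
        for c in range(0, cols - cell + 1, cell):
            a = azimuth[r:r + cell, c:c + cell].ravel()
            z = zenith[r:r + cell, c:c + cell].ravel()
            hist, _, _ = np.histogram2d(
                a, z,
                bins=(az_bins, ze_bins),
                range=((-np.pi, np.pi), (0.0, np.pi / 2)),
            )
            hist = hist.ravel()
            # L1-normalise each cell so cells with equal size contribute equally.
            features.append(hist / (hist.sum() + 1e-12))
    return np.concatenate(features)


if __name__ == "__main__":
    # Synthetic planar ramp: depth increases by one unit per row.
    ramp = np.tile(np.arange(16.0)[:, None], (1, 16))
    print(honv_like(ramp).shape)
```

For a 16 x 16 depth patch with the default settings, the descriptor has 2 x 2 cells of 8 x 4 bins each. A planar ramp puts all mass of each cell in a single bin, which is the expected behaviour for a surface with one constant normal direction.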
References
Lai, K., Bo, L., Ren, X., Fox, D.: A large-scale hierarchical multi-view rgb-d object dataset. In: International Conference on Robotics and Automation (2011)
Microsoft Corp., http://www.xbox.com/en-US/kinect
PrimeSense Corp., http://www.primesense.com/
Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)
Ikemura, S., Fujiyoshi, H.: Real-Time Human Detection Using Relational Depth Similarity Features. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010, Part IV. LNCS, vol. 6495, pp. 25–38. Springer, Heidelberg (2011)
Xia, L., Chen, C.C., Aggarwal, J.K.: Human detection using depth information by kinect. In: Workshop on Human Activity Understanding from 3D Data in conjunction with IEEE Conference on Computer Vision and Pattern Recognition, HAU3D (2011)
Bo, L., Ren, X., Fox, D.: Depth kernel descriptors for object recognition. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2011)
Bo, L., Lai, K., Ren, X., Fox, D.: Object recognition with hierarchical kernel descriptors. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)
Lai, K., Bo, L., Ren, X., Fox, D.: Sparse distance learning for object recognition combining rgb and depth information. In: International Conference on Robotics and Automation (2011)
Ding, H., Moutarde, F., Shaiek, A.: 3d object recognition and person facial identification using time-averaged single-views from time-of-flight 3d depth-camera. In: Eurographics Workshop on 3D Object Retrieval (2010)
Oppenheim, A.V., Schafer, R.W., Buck, J.R.: Discrete-time signal processing, 2nd edn. Prentice-Hall, Inc. (1999)
Oren, M., Papageorgiou, C., Sinha, P., Osuna, E., Poggio, T.: Pedestrian detection using wavelet templates. In: IEEE Conference on Computer Vision and Pattern Recognition (1997)
Viola, P., Jones, M.: Robust real-time object detection. International Journal of Computer Vision (2002)
Lowe, D.G.: Object recognition from local scale-invariant features. In: IEEE International Conference on Computer Vision (1999)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893 (2005)
Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)
Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part II. LNCS, vol. 3952, pp. 428–441. Springer, Heidelberg (2006)
Zhu, Q., Avidan, S., Yeh, M.C., Cheng, K.T.: Fast human detection using a cascade of histograms of oriented gradients. In: IEEE Conference on Computer Vision and Pattern Recognition (2006)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D.: Cascade object detection with deformable part models. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)
Ahonen, T., Hadid, A., Pietikäinen, M.: Face Recognition with Local Binary Patterns. In: Pajdla, T., Matas, J. (eds.) ECCV 2004, Part I. LNCS, vol. 3021, pp. 469–481. Springer, Heidelberg (2004)
Wang, X., Han, T.X., Yan, S.: An hog-lbp human detector with partial occlusion handling. In: IEEE International Conference on Computer Vision (2009)
Sabata, B., Arman, F., Aggarwal, J.K.: Segmentation of 3d range images using pyramidal data structures. CVGIP: Image Underst. 57, 373–387 (1993)
Vemuri, B.C., Mitiche, A., Aggarwal, J.K.: Curvature-based representation of objects from range data. Image Vision Comput. 4, 107–114 (1986)
Zhu, Y., Fujimura, K.: 3d head pose estimation with optical flow and depth constraints. 3D Digital Imaging and Modeling (2003)
Ess, A., Leibe, B., Van Gool, L.: Depth and appearance for mobile scene analysis. In: IEEE International Conference on Computer Vision (2007)
Johnson, A., Hebert, M.: Using spin images for efficient object recognition in cluttered 3d scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence (1999)
Bhattacharyya, A.: On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin of the Calcutta Mathematical Society 35, 99–109 (1943)
Cai, Q., Gallup, D., Zhang, C., Zhang, Z.: 3D Deformable Face Tracking with a Commodity Depth Camera. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 229–242. Springer, Heidelberg (2010)
Henry, P., Krainin, M., Herbst, E., Ren, X., Fox, D.: Rgbd mapping: Using depth cameras for dense 3d modeling of indoor environments. In: RGB-D: Advanced Reasoning with Depth Cameras Workshop in Conjunction with RSS (2010)
Du, H., Henry, P., Ren, X., Cheng, M., Goldman, D.B., Seitz, S.M., Fox, D.: Interactive 3d modeling of indoor environments with a consumer depth camera. In: ACM International Conference on Ubiquitous Computing (2011)
Herbst, E., Ren, X., Fox, D.: Rgb-d object discovery via multi-scene analysis. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (2011)
Yu, K., Zhang, T.: Improved local coordinate coding using local tangents. In: Proceedings of the 27th International Conference on Machine Learning, ICML 2010 (2010)
Yu, K., Zhang, T., Gong, Y.: Nonlinear learning using local coordinate coding. In: Advances in Neural Information Processing Systems, vol. 22, pp. 2223–2231 (2009)
Xu, D., Xu, W.: Description and recognition of object contours using arc length and tangent orientation. Pattern Recognition Letters, 855–864 (2005)
Joachims, T.: Making large-scale svm learning practical. LS8-Report 24, Universität Dortmund, LS VIII-Report (1998)
Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17 (1995)
Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 603–619 (2002)
Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision 88, 303–338 (2010)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tang, S. et al. (2013). Histogram of Oriented Normal Vectors for Object Recognition with a Depth Sensor. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_41
DOI: https://doi.org/10.1007/978-3-642-37444-9_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37443-2
Online ISBN: 978-3-642-37444-9
eBook Packages: Computer Science, Computer Science (R0)