Real-time Recognition and Pursuit in Robots Based on 3D Depth Data

Boubou, Somar; Jabbari Asl, Hamed; Narikiyo, Tatsuo; Kawanishi, Michihiro

doi:10.1007/s10846-017-0769-1

Published: 11 January 2018

Volume 93, pages 587–600 (2019)

Journal of Intelligent & Robotic Systems

Abstract

In this work, we address the problem of robot pursuit based on a real-time object recognition system that uses 3D depth sensors. In contrast to traditional recognition approaches based on RGB-D data, we propose a novel global online descriptor designed for object recognition from depth data alone. The proposed descriptor, which we name the Differential Histogram of Normal Vectors (DHONV), is designed to extract the geometric characteristics of the captured 3D surfaces of the objects present in depth images. To obtain a compact description of the visible 3D surfaces of each object, we quantize the differential angles of the surface's normal vectors into a 1D histogram. Object recognition experiments on a self-collected dataset and a benchmark RGB-D object dataset show that the proposed descriptor outperforms other depth-based descriptors. Moreover, we conducted real-time experiments with RoboCars, which validate the capability of the proposed method to perform real-time recognition and pursuit tasks in an indoor environment based solely on depth data.
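The idea of quantizing differential angles of surface normals into a 1D histogram can be illustrated with a minimal sketch. This is not the authors' DHONV implementation: the choice of angle (the zenith angle of the normal), the neighbourhood (horizontally adjacent pixels), the bin count, and the function name `differential_normal_histogram` are all assumptions made only for illustration, and invalid depth pixels are not handled.

```python
import numpy as np

def differential_normal_histogram(depth, bins=15):
    """Illustrative 1D histogram of angle differences between neighbouring normals.

    Assumptions (not taken from the paper): zenith angle of the normal,
    differences along image rows, uniform bins over (-pi/2, pi/2).
    """
    depth = np.asarray(depth, dtype=float)
    # Depth gradients along image rows (y) and columns (x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an unnormalised normal is (-dz/dx, -dz/dy, 1).
    # Its zenith angle (angle to the optical axis) lies in [0, pi/2).
    zenith = np.arctan(np.hypot(dz_dx, dz_dy))
    # "Differential" angle: change in zenith between horizontal neighbours.
    diff = np.diff(zenith, axis=1)
    hist, _ = np.histogram(diff, bins=bins, range=(-np.pi / 2, np.pi / 2))
    total = hist.sum()
    # Normalise so descriptors from regions of different size are comparable.
    return hist / total if total > 0 else hist.astype(float)
```

For a planar patch every normal is identical, so all differences are zero and the whole mass falls into the central bin; curved or irregular surfaces spread the mass across bins.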

Author information

Authors and Affiliations

Department of Advanced Science and Technology, Toyota Technological Institute, Nagoya, Japan
Somar Boubou, Hamed Jabbari Asl, Tatsuo Narikiyo & Michihiro Kawanishi

Corresponding author

Correspondence to Somar Boubou.

Appendix

Here, we show how the kinematic models (13) and (14) are derived. Differentiating (8), we have

$$ \dot{\rho}=\frac{\left( \dot{x}_{2}-\dot{x}_{1}\right)\left( x_{2}-x_{1}\right)+\left( \dot{y}_{2}-\dot{y}_{1}\right)\left( y_{2}-y_{1}\right)}{\rho} $$

(21)

Also, from the definition of λ we have

$$\begin{array}{@{}rcl@{}} &&\rho\cos(\lambda)=x_{2}-x_{1} \end{array} $$

(22)

$$\begin{array}{@{}rcl@{}} &&\rho\sin(\lambda)=y_{2}-y_{1} \end{array} $$

(23)

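Substituting Eqs. 22 and 23 into Eq. 21, and writing the velocity components as $\dot{x}_{i}=\nu_{i}\cos(\theta_{i})$ and $\dot{y}_{i}=\nu_{i}\sin(\theta_{i})$ for $i=1,2$, we obtain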

$$\begin{array}{@{}rcl@{}} \dot{\rho}&=&-\nu_{1}\Big(\cos(\theta_{1})\cos(\lambda)+\sin(\theta_{1})\sin(\lambda)\Big)\\&+&\nu_{2}\Big(\cos(\theta_{2})\cos(\lambda)+\sin(\theta_{2})\sin(\lambda)\Big) \\&=&-\nu_{1}\cos(\theta_{1}-\lambda)+{\Delta}_{\rho} \end{array} $$

(24)

where ${\Delta }_{\rho }\triangleq \nu _{2}\cos (\theta _{2}-\lambda )$. Therefore, according to Eq. 9, the kinematics of ρ can be written as in Eq. 13.
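As a sanity check on Eq. 24, the closed form can be compared against a finite-difference derivative of the distance ρ. The sketch below assumes that both robots move with velocity components $\nu_i\cos(\theta_i)$, $\nu_i\sin(\theta_i)$, as the expansion in Eq. 24 implies; the function names and the test values are hypothetical.

```python
import math

def advance(p, theta, nu, dt):
    """Move point p along heading theta at speed nu for time dt (assumed velocity model)."""
    return (p[0] + nu * math.cos(theta) * dt, p[1] + nu * math.sin(theta) * dt)

def distance(p1, p2):
    """rho: distance between pursuer position p1 and target position p2."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

def rho_dot_closed_form(p1, p2, theta1, theta2, nu1, nu2):
    """Right-hand side of Eq. 24: -nu1*cos(theta1 - lambda) + Delta_rho."""
    lam = math.atan2(p2[1] - p1[1], p2[0] - p1[0])  # satisfies Eqs. 22 and 23
    delta_rho = nu2 * math.cos(theta2 - lam)
    return -nu1 * math.cos(theta1 - lam) + delta_rho

def rho_dot_numeric(p1, p2, theta1, theta2, nu1, nu2, dt=1e-6):
    """Central finite difference of rho along the two straight-line motions."""
    fwd = distance(advance(p1, theta1, nu1, dt), advance(p2, theta2, nu2, dt))
    bwd = distance(advance(p1, theta1, nu1, -dt), advance(p2, theta2, nu2, -dt))
    return (fwd - bwd) / (2.0 * dt)
```

With a stationary target directly ahead of the pursuer, the closed form reduces to $-\nu_1$, which matches the intuition that the gap closes at the pursuer's speed.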

Also, differentiating (10) and using Eqs. 22 and 23, we have

$$\begin{array}{@{}rcl@{}} \dot{\lambda} &=&\frac{\left( \sin(\theta_{2})\nu_{2}-\sin(\theta_{1})\nu_{1}\right)\cos(\lambda)-\left( \cos(\theta_{2})\nu_{2}-\cos(\theta_{1})\nu_{1}\right)\sin(\lambda)}{\rho}\\ &=&-\frac{\nu_{1}}{\rho}\sin(\theta_{1}-\lambda)-{\Delta}_{\psi} \end{array} $$

(25)

where ${\Delta }_{\psi }\triangleq -\frac {\nu _{2}}{\rho }\sin (\theta _{2}-\lambda )$. Therefore, differentiating (9) and using Eqs. 7 and 25, the kinematics of ψ can be written as in Eq. 14.
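The sign conventions in Eq. 25 are easy to get wrong, so a numerical check may help. The sketch below compares the closed form, including the definition of $\Delta_{\psi}$, with a finite-difference derivative of λ. It assumes the same velocity components $\nu_i\cos(\theta_i)$, $\nu_i\sin(\theta_i)$ used in the expansion of Eq. 25; the function names and test values are hypothetical.

```python
import math

def advance(p, theta, nu, dt):
    """Move point p along heading theta at speed nu for time dt (assumed velocity model)."""
    return (p[0] + nu * math.cos(theta) * dt, p[1] + nu * math.sin(theta) * dt)

def los_angle(p1, p2):
    """lambda: angle of the line from p1 to p2, consistent with Eqs. 22 and 23."""
    return math.atan2(p2[1] - p1[1], p2[0] - p1[0])

def lambda_dot_closed_form(p1, p2, theta1, theta2, nu1, nu2):
    """Right-hand side of Eq. 25: -(nu1/rho)*sin(theta1 - lambda) - Delta_psi."""
    rho = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    lam = los_angle(p1, p2)
    delta_psi = -(nu2 / rho) * math.sin(theta2 - lam)
    return -(nu1 / rho) * math.sin(theta1 - lam) - delta_psi

def lambda_dot_numeric(p1, p2, theta1, theta2, nu1, nu2, dt=1e-6):
    """Central finite difference of lambda (valid away from the atan2 branch cut)."""
    fwd = los_angle(advance(p1, theta1, nu1, dt), advance(p2, theta2, nu2, dt))
    bwd = los_angle(advance(p1, theta1, nu1, -dt), advance(p2, theta2, nu2, -dt))
    return (fwd - bwd) / (2.0 * dt)
```

For example, with a stationary target at unit distance along the x-axis and the pursuer moving along the positive y-axis at unit speed, both expressions give a rate of −1 rad per unit time.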

About this article

Cite this article

Boubou, S., Jabbari Asl, H., Narikiyo, T. et al. Real-time Recognition and Pursuit in Robots Based on 3D Depth Data. J Intell Robot Syst 93, 587–600 (2019). https://doi.org/10.1007/s10846-017-0769-1

Received: 01 October 2016
Accepted: 27 December 2017
Published: 11 January 2018
Issue Date: 15 March 2019
DOI: https://doi.org/10.1007/s10846-017-0769-1
