Abstract
In this work, we address the problem of robot pursuit based on a real-time object recognition system that uses 3D depth sensors. In contrast to traditional recognition approaches based on RGB-D data, we propose a novel global online descriptor designed for object recognition from depth data alone. The proposed descriptor, which we name the Differential Histogram of Normal Vectors (DHONV), is designed to extract the geometric characteristics of the captured 3D surfaces of the objects present in depth images. To obtain a compact description of the visible 3D surfaces of each object, we quantize the differential angles of the surface’s normal vectors into a 1D histogram. Object recognition experiments on a self-collected dataset and a benchmark RGB-D object dataset show that the proposed descriptor outperforms other depth-based descriptors. Moreover, we conducted real-time experiments with RoboCars, which validate the capability of the proposed method to perform real-time recognition and pursuit tasks in indoor environments based solely on depth data.
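As an illustration only, the quantization step described in the abstract might be sketched as follows. This is not the authors' implementation: the central-difference normal estimate, the choice of right and lower pixel neighbors for the differential angle, the bin count, and the function names `surface_normals` and `differential_angle_histogram` are all assumptions introduced for this sketch.

```python
import math


def surface_normals(depth):
    """Estimate a unit normal per interior pixel (assumed: central differences)."""
    rows, cols = len(depth), len(depth[0])
    normals = {}
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dzdx = (depth[r][c + 1] - depth[r][c - 1]) / 2.0
            dzdy = (depth[r + 1][c] - depth[r - 1][c]) / 2.0
            norm = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            normals[(r, c)] = (-dzdx / norm, -dzdy / norm, 1.0 / norm)
    return normals


def differential_angle_histogram(depth, bins=8):
    """Quantize angles between neighboring normals into a normalized 1D histogram."""
    normals = surface_normals(depth)
    hist = [0] * bins
    for (r, c), n in normals.items():
        # Assumed neighborhood: the pixel to the right and the pixel below.
        for neighbor in ((r, c + 1), (r + 1, c)):
            m = normals.get(neighbor)
            if m is None:
                continue
            dot = n[0] * m[0] + n[1] * m[1] + n[2] * m[2]
            dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
            angle = math.acos(dot)  # lies in [0, pi]
            idx = min(int(angle / math.pi * bins), bins - 1)
            hist[idx] += 1
    total = sum(hist)
    if total == 0:
        return [0.0] * bins
    return [h / total for h in hist]
```

On a perfectly flat depth patch all neighboring normals coincide, so every differential angle is zero and the whole mass falls into the first bin; curved surfaces spread mass over the higher bins.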
Appendix
Here, we show how the kinematic models in Eqs. 13 and 14 are derived. Differentiating Eq. 8, we have
Also, from the definition of λ we have
Substituting these terms in Eq. 21, we obtain
where \(\Delta_{\rho} \triangleq \nu_{2}\cos(\theta_{2}-\lambda)\). Therefore, according to Eq. 9, the kinematics of ρ can be written as in Eq. 13.
Also, differentiating Eq. 10 and using Eqs. 22 and 23, we have
where \(\Delta_{\psi} \triangleq -\frac{\nu_{2}}{\rho}\sin(\theta_{2}-\lambda)\). Therefore, differentiating Eq. 9 and using Eqs. 7 and 25, the kinematics of ψ can be written as in Eq. 14.
Cite this article
Boubou, S., Jabbari Asl, H., Narikiyo, T. et al. Real-time Recognition and Pursuit in Robots Based on 3D Depth Data. J Intell Robot Syst 93, 587–600 (2019). https://doi.org/10.1007/s10846-017-0769-1
DOI: https://doi.org/10.1007/s10846-017-0769-1