
Efficient Hybrid-Supervised Deep Reinforcement Learning for Person Following Robot

Published in: Journal of Intelligent & Robotic Systems

Abstract

Traditional person-following robots usually rely on hand-crafted features and a carefully designed controller to follow the assigned person. Such systems are difficult to apply outdoors because of the variability and complexity of the environment. In this paper, we propose an approach in which an agent is trained by hybrid-supervised deep reinforcement learning (DRL) to perform a person-following task in an end-to-end manner. The approach enables the robot to learn features autonomously from monocular images and to improve its performance through robot-environment interaction. Experiments show that the proposed approach adapts to complex situations involving significant illumination variation, object occlusion, target disappearance, pose change, and pedestrian interference. To speed up training, and thus ease the application of DRL to real-world robotic follower control, we apply an integration method in which the agent receives prior knowledge from a supervised learning (SL) policy network and then reinforces its performance with a value-based or policy-based (including actor-critic) DRL model. We also employ an efficient data collection approach for supervised learning in the context of person following. Experimental results not only verify the robustness of the proposed DRL-based person-following robot system but also show how readily the robot learns from mistakes and improves its performance.
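The two-stage scheme summarized above (an SL policy network providing prior knowledge, then policy-based DRL refining it against an environment reward) can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's actual setup: a linear softmax policy stands in for the deep network, a hand-coded "teacher" rule stands in for the collected demonstrations, and a simple agreement reward stands in for the real following reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy state: [horizontal offset of the target in the image, constant bias term].
# Actions: 0 = turn left, 1 = go straight, 2 = turn right.
N_ACT = 3

def features(offset):
    return np.array([offset, 1.0])

W = rng.normal(scale=0.1, size=(N_ACT, 2))  # linear softmax policy (hypothetical)

def policy(s):
    return softmax(W @ s)

def teacher(offset):
    # Hand-coded demonstrator standing in for the SL training labels.
    if offset < -0.2:
        return 0
    if offset > 0.2:
        return 2
    return 1

# Stage 1: supervised pretraining on (state, action) demonstration pairs.
demos = [(features(o), teacher(o)) for o in rng.uniform(-1, 1, 500)]
for _ in range(300):
    grad = np.zeros_like(W)
    for s, a in demos:
        p = policy(s)
        p[a] -= 1.0                      # cross-entropy gradient w.r.t. logits
        grad += np.outer(p, s)
    W -= 0.5 * grad / len(demos)

# Stage 2: REINFORCE-style fine-tuning against an environment reward.
for _ in range(300):
    o = rng.uniform(-1, 1)
    s = features(o)
    p = policy(s)
    a = rng.choice(N_ACT, p=p)
    r = 1.0 if a == teacher(o) else -0.1  # toy reward: agreement with teacher
    # Gradient ascent on r * log pi(a|s) for a linear softmax policy.
    W += 0.1 * r * np.outer(np.eye(N_ACT)[a] - p, s)

offsets = rng.uniform(-1, 1, 200)
acc = np.mean([np.argmax(policy(features(o))) == teacher(o) for o in offsets])
print(f"greedy agreement with teacher: {acc:.2f}")
```

The pretraining stage gives the policy a sensible starting point before any reward signal is seen, which is the mechanism the paper credits for the speed-up over training the DRL agent from scratch.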



Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities (N172608005, N182608004), the Young and Middle-aged Innovative Talent Plan of Shenyang (RC170490), the Natural Science Foundation of Liaoning (No. 20180520040), and the National Natural Science Foundation of China (Nos. 61471110 and 61733003).

Author information

Corresponding author

Correspondence to Yunzhou Zhang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(MP4 21.6 MB)


About this article


Cite this article

Pang, L., Zhang, Y., Coleman, S. et al. Efficient Hybrid-Supervised Deep Reinforcement Learning for Person Following Robot. J Intell Robot Syst 97, 299–312 (2020). https://doi.org/10.1007/s10846-019-01030-0

