Abstract
In the human-following task, human detection, tracking and identification are fundamental steps that help the mobile robot follow the selected target person (STP) while maintaining an appropriate distance and orientation, without posing any threat. Recently, along with the widespread deployment of robots in general and service robots in particular, not only safety but also flexibility, naturalness and sociability are demanded at an increasingly high level in human-friendly services and collaborative tasks. This requirement makes robustly detecting, tracking and identifying the STP more challenging, since human–robot cooperation becomes more complex and unpredictable. Obviously, safe and natural robot behavior cannot be ensured if the STP is lost or the robot misidentifies its target. In this paper, a hierarchical approach is presented to update the states of the STP more robustly during the human-following task. The method is proposed with the goal of achieving good performance (robust, accurate and fast response) to support safe and natural robot behaviors on modest hardware. The proposed system is verified by a set of experiments and shows reasonable results.
Code or data availability
Not applicable.
References
Islam MJ, Hong J, Sattar J (2019) Person-following by autonomous robots: a categorical overview. Int J Robot Res 38(14):1581–1618
Rudenko A et al (2020) Human motion trajectory prediction: a survey. Int J Robot Res 39(8):895–935
Leigh A et al (2015) Person tracking and following with 2D laser scanners. In: 2015 IEEE international conference on robotics and automation (ICRA), Seattle, Washington, USA, 26–30 May 2015, pp 726–733
Yuan J et al (2018) Laser-based intersection-aware human following with a mobile robot in indoor environments. IEEE Trans Syst Man Cybern Syst 51(1):354–369
Beyer L et al (2018) Deep person detection in two-dimensional range data. IEEE Robot Autom Lett 3(3):2726–2733
Guerrero-Higueras AM et al (2019) Tracking people in a mobile robot from 2D LIDAR scans using full convolutional neural networks for security in cluttered environments. Front Neurorobot 12:85
Eguchi R, Yorozu A, Takahashi M (2019) Spatiotemporal and kinetic gait analysis system based on multisensor fusion of laser range sensor and instrumented insoles. In: 2019 IEEE international conference on robotics and automation (ICRA), Montreal, QC, Canada, 20–24 May 2019, pp 4876–4881
Duong HT, Suh YS (2020) Human gait tracking for normal people and walker users using a 2D LiDAR. IEEE Sens J 20(11):6191–6199
Cha D, Chung W (2020) Human-leg detection in 3D feature space for a person-following mobile robot using 2D LiDARs. Int J Precis Eng Manuf 21(7):1299–1307
Mandischer N et al (2021) Radar tracker for human legs based on geometric and intensity features. In: 2021 29th European signal processing conference (EUSIPCO), Dublin, Ireland, 23–27 August 2021, pp 1521–1525
Eguchi R, Takahashi M (2022) Human leg tracking by fusion of laser range and insole force sensing with Gaussian mixture model-based occlusion compensation. IEEE Sens J 22(4):3704–3714
Torta E et al (2011) Design of robust robotic proxemic behavior. In: Social robotics: third international conference on social robotics, ICSR 2011, Amsterdam, The Netherlands, 24–25 November 2011, Proceedings 3, pp 21–30
Torta E et al (2013) Design of a parametric model of personal space for robotic social navigation. Int J Soc Robot 5(3):357–365
Truong X-T, Ngo T-D (2016) Dynamic social zone based mobile robot navigation for human comfortable safety in social environments. Int J Soc Robot 8(5):663–684
Van Toan N, Khoi PB (2019) Fuzzy-based-admittance controller for safe natural human-robot interaction. Adv Robot 33(15–16):815–823
Van Toan N, Khoi PB (2019) A control solution for closed-form mechanisms of relative manipulation based on fuzzy approach. Int J Adv Robot Syst 16(2):1–11
Van Toan N, Do MH, Jo J (2022) Robust-adaptive-behavior strategy for human-following robots in unknown environments based on fuzzy inference mechanism. Ind Robot Int J Robot Res Appl 49(6):1089–1100
Van Toan N et al (2023) The human-following strategy for mobile robots in mixed environments. Robot Auton Syst 160:104317
Van Toan N, Khoi PB, Yi SY (2021) A MLP-hedge-algebras admittance controller for physical human–robot interaction. Appl Sci 11(12):5459
Van Toan N, Yi S-Y, Khoi PB (2020) Hedge algebras-based admittance controller for safe natural human-robot interaction. Adv Robot 34(24):1546–1558
Khoi PB, Van Toan N (2018) Hedge-algebras-based controller for mechanisms of relative manipulation. Int J Precis Eng Manuf 19(3):377–385
Fosty B et al (2016) Accuracy and reliability of the RGB-D camera for measuring walking speed on a treadmill. Gait Posture 48:113–119
Koide K, Miura J (2016) Identification of a specific person using color, height, and gait features for a person following robot. Robot Auton Syst 84:76–87
Chen BX, Sahdev R, Tsotsos JK (2017) Integrating stereo vision with a CNN tracker for a person-following robot. In: International conference on computer vision systems. Springer, Berlin/Heidelberg, pp 300–313
Lee B-J et al (2018) Robust human following by deep Bayesian trajectory prediction for home service robots. In: 2018 IEEE international conference on robotics and automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018, pp 7189–7195
Yang C-A, Song K-T (2019) Control design for robotic human-following and obstacle avoidance using an RGB-D camera. In: 2019 19th International conference on control, automation and systems (ICCAS 2019), Jeju, South Korea, 15–18 October 2019, pp 934–939
Vilas-Boas MC et al (2019) Full-body motion assessment: concurrent validation of two body tracking depth sensors versus a gold standard system during gait. J Biomech 87:189–196
Yagi K et al (2020) Gait measurement at home using a single RGB camera. Gait Posture 76:136–140
Yorozu A, Takahashi M (2020) Estimation of body direction based on gait for service robot applications. Robot Auton Syst 132:103603
Redhwan A, Choi M-T (2020) Deep-learning-based indoor human following of mobile robot using color feature. Sensors (Basel) 20(9):2699
Van Toan N, Hoang MD, Jo J (2022) MoDeT: a low-cost obstacle tracker for self-driving mobile robot navigation using 2D-laser scan. Ind Robot Int J Robot Res Appl 49(6):1032–1041
Ren S et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Proceedings of the 28th international conference on neural information processing systems, Montreal, Canada, 7–12 December 2015, pp 91–99
Girshick R et al (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE conference on computer vision and pattern recognition, Columbus, OH, USA, 23–28 June 2014, pp 580–587
Dai J et al (2016) R-FCN: Object detection via region-based fully convolutional networks. In: Proceedings of the 30th international conference on neural information processing systems, Barcelona, Spain, 5–10 December 2016, pp 379–387
Redmon J et al (2016) You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp 779–788
Liu W et al (2016) SSD: Single shot multibox detector. In: European conference on computer vision, Amsterdam, Netherlands, 11–14 October 2016, pp 21–37
Vu T-H, Osokin A, Laptev I (2015) Context-aware CNNs for person head detection. In: 2015 IEEE international conference on computer vision (ICCV), Santiago, Chile, 07–13 December 2015, pp 2893–2901
Rashid M, Gu X, Lee YJ (2017) Interspecies knowledge transfer for facial keypoint detection. In: IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017, pp 6894–6903
Girdhar R et al (2018) Detect-and-track: efficient pose estimation in videos. In: IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, Utah, USA, 18–22 June 2018, pp 350–359
Hong M et al (2022) SSPNet: scale selection pyramid network for tiny person detection from UAV images. IEEE Geosci Remote Sens Lett 19:1–5. https://doi.org/10.1109/LGRS.2021.3103069
Howard AG et al (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. Comput Vis Pattern Recognit. https://doi.org/10.48550/arXiv.1704.04861
Labeling Image (labelImg). Available at: https://github.com/heartexlabs/labelImg
King D (2017) High quality face recognition with deep metric learning. Available at: http://blog.dlib.net/2017/02/high-quality-face-recognition-with-deep.html
Huang GB, Learned-Miller E (2014) Labeled faces in the wild: updates and new reporting procedures. Technical Report UM-CS-2014–03, University of Massachusetts, Amherst
Huang GB et al (2007) Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical Report 07–49, University of Massachusetts, Amherst
Hermans A, Beyer L, Leibe B (2017) In defense of the triplet loss for person re-identification. Comput Vis Pattern Recognit. https://arxiv.org/abs/1703.07737
Yuan Y et al (2020) In defense of the triplet loss again: learning robust person re-identification with fast approximated triplet loss and label distillation. In: 2020 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020, pp 1454–1463
He K et al (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016, pp 770–778
He K et al (2016) Identity mappings in deep residual networks. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer vision—ECCV 2016. ECCV 2016. Lecture notes in computer science, vol 9908. Springer, Cham. https://doi.org/10.1007/978-3-319-46493-0_38
van der Maaten L (2014) Accelerating t-SNE using tree-based algorithms. J Mach Learn Res 15(93):3221–3245
Ku J, Harakeh A, Waslander SL (2018) In defense of classical image processing: fast depth completion on the CPU. In: 2018 15th conference on computer and robot vision (CRV), Toronto, Canada, 9–11 May 2018, pp 16–22
Ku J et al (2018) Joint 3D proposal generation and object detection from view aggregation. In: 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS), Madrid, Spain, 01–05 October 2018, pp 1–8
Lahoud J, Ghanem B (2017) 2D-driven 3D object detection in RGB-D images. In: 2017 IEEE international conference on computer vision (ICCV), Venice, Italy, 22–29 October 2017, pp 4622–4630
Qi CR et al (2018) Frustum pointnets for 3D object detection from RGB-D data. In: 2018 IEEE conference on computer vision and pattern recognition (CVPR), Salt Lake City, Utah, USA, 18–22 June 2018, pp 918–927
Shi W et al (2018) Dynamic obstacles rejection for 3D map simultaneous updating. IEEE Access 6:37715–37724
Acknowledgements
Not applicable.
Funding
This work was supported by the Research Program funded by the Seoul National University of Science and Technology.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the author(s).
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
In the following tables, “Seq” indicates the sequence of the events, “Time” indicates the moment at which the action starts in the experiment, and “Snapshot” indicates the time instant at which the action starts in the video.
1.1 Appendix 1: Sequence of events of the experiment in this paper (in the demo video)
Seq | Time (mm:ss) | Action | Snapshot (mm:ss) |
---|---|---|---|
1 | 00:00 | Only the 2D-LiDAR sensor is used. The robot detects and tracks its STP and other persons appearing in its detection range | 
2 | 00:13 | Only the 2D-LiDAR sensor is used. The robot tracks the STP while he is too close to environmental objects | 
3 | 00:29 | Only the 2D-LiDAR sensor is used. The robot misidentifies its STP: the identification number of the STP is switched to an environmental object | 
4 | 00:35 | Hierarchical approach. The STP moves too close to environmental objects, so the robot activates the visual human detection | 
5 | 00:55 | Hierarchical approach. The STP moves away from environmental objects. With nothing around the STP, the visual-based modules are deactivated | 
6 | 01:20 | Hierarchical approach. Other people are close to the STP. The robot activates the visual human detection and face identification to identify the correct STP | 
7 | 01:53 | Hierarchical approach. Other people are close to the STP. The robot first activates the human detection and face identification to identify its STP; if the STP’s face is not visible, the body identification is activated | 
8 | 02:37 | Hierarchical approach. There are no other people or environmental objects near the STP, so the robot deactivates the visual-based modules | 
9 | 02:46 | Hierarchical approach. During the visual identification procedure, the STP is tracked (2D RGB human tracking and 3D point-cloud tracking) so that it can be matched with the tracked human legs after the visual identification is finished | 
10 | 02:58 | STP data collection for the visual identification procedure |
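The event list above implies a simple activation policy: the 2D-LiDAR leg tracker runs continuously, and the visual modules are switched on only when the LiDAR tracker alone becomes ambiguous. The Python fragment below is a minimal sketch of that policy, not the paper's implementation; the thresholds and all function and field names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the paper's actual values are not given here.
OBJECT_PROXIMITY_M = 0.5   # STP considered "too close" to environmental objects
PERSON_PROXIMITY_M = 1.0   # other people considered "close" to the STP


@dataclass
class PerceptionModules:
    """Flags describing which perception modules are currently active."""
    lidar_leg_tracking: bool = True      # always on (2D-LiDAR)
    visual_human_detection: bool = False
    face_identification: bool = False
    body_identification: bool = False


def update_modules(dist_to_nearest_object_m: float,
                   dist_to_nearest_other_person_m: float,
                   stp_face_visible: bool) -> PerceptionModules:
    """Sketch of the hierarchical activation logic implied by the event list:
    LiDAR-only tracking by default, visual modules only when ambiguity arises."""
    m = PerceptionModules()

    # Events 4-5: STP near environmental objects -> enable visual human detection.
    if dist_to_nearest_object_m < OBJECT_PROXIMITY_M:
        m.visual_human_detection = True

    # Events 6-8: other people near the STP -> identify the STP visually.
    if dist_to_nearest_other_person_m < PERSON_PROXIMITY_M:
        m.visual_human_detection = True
        if stp_face_visible:
            m.face_identification = True
        else:
            # Event 7: no usable face view -> fall back to body identification.
            m.body_identification = True

    return m


if __name__ == "__main__":
    # STP walking in open space: only the LiDAR leg tracker stays active.
    print(update_modules(2.0, 3.0, stp_face_visible=True))
    # Another person nearby, face not visible: body identification takes over.
    print(update_modules(2.0, 0.6, stp_face_visible=False))
```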
1.2 Appendix 2: Effects and CPU consumption of the sub-methods in the visual identification procedure of Appendix 1
Seq | Time (mm:ss) | Action | Snapshot (mm:ss) |
---|---|---|---|
1 | 00:00 | Only the visual-based human detection is activated | |
2 | 00:31 | Visual-based human detection and body identification are activated | |
3 | 01:24 | Only the face identification is activated |
1.3 Appendix 3: Sequences of the flexible heading behavior during human–robot cooperation (presented in [17, 18])
Seq | Time (mm:ss) | Action | Snapshot (mm:ss) |
---|---|---|---|
1 | 00:00 (in [17]) | The robot flexibly changes its heading when the STP changes his moving direction and his side with respect to the robot's local coordinates (in [17]) | 
2 | 03:40 (in [17]) | The human-following task is disturbed by other people (in [17]) | 
3 | 00:00 (in [18]) | The robot follows and supports the STP to take food trays in the office | 
4 | 01:00 (in [18]) | Human–robot cooperation in narrow areas surrounded by many environmental objects and prohibited areas (in [18]) |
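References [17] and [18] regulate the robot's heading with fuzzy-based strategies, which are not reproduced here. The snippet below is only a minimal sketch of the geometric quantities such a flexible-heading behavior regulates, given the STP position in the robot's local frame; the desired following distance and all names are assumptions.

```python
import math

# Hypothetical parameter for illustration only.
DESIRED_FOLLOWING_DISTANCE_M = 1.2


def heading_and_distance_error(stp_x_m: float, stp_y_m: float):
    """Given the STP position in the robot's local frame (x forward, y left),
    return the heading error (rad) and distance error (m) a follower could
    drive toward zero to keep facing the STP at the desired range."""
    heading_error = math.atan2(stp_y_m, stp_x_m)  # 0 when the STP is dead ahead
    distance_error = math.hypot(stp_x_m, stp_y_m) - DESIRED_FOLLOWING_DISTANCE_M
    return heading_error, distance_error


if __name__ == "__main__":
    # STP 2 m ahead and 0.5 m to the robot's left.
    h, d = heading_and_distance_error(2.0, 0.5)
    print(f"heading error: {math.degrees(h):.1f} deg, distance error: {d:.2f} m")
```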
1.4 Appendix 4: Sequence of events of the experiment video of object tracking using the fusion of the 2D-LiDAR and RGB-D cameras
Seq | Action | Snapshot (mm:ss) |
---|---|---|
1 | The robot is not moving when detecting and tracking objects using only RGB-D cameras | |
2 | The robot is not moving when detecting and tracking objects using only RGB-D cameras. Here, the visualization of the filtered 3D point cloud data is turned on | |
3 | The robot is not moving when detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras | |
4 | The robot is moving when detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras | |
5 | The robot is not moving when detecting and tracking objects using the fusion of the 2D-LiDAR and RGB-D cameras |
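The paper's actual LiDAR/RGB-D fusion is not reproduced here; the following is a minimal, hypothetical sketch of one way leg tracks from the 2D-LiDAR could be associated with person centroids from the RGB-D cameras on the ground plane, using greedy nearest-neighbour matching within a gating distance. The gate value and all names are assumptions.

```python
import math

# Hypothetical gating distance for matching a LiDAR leg track to an RGB-D detection.
ASSOCIATION_GATE_M = 0.4


def associate_tracks(lidar_tracks, rgbd_detections):
    """Greedy nearest-neighbour association of 2D-LiDAR leg tracks with RGB-D
    person detections, both given as (x, y) points on the ground plane in the
    robot frame. Returns a dict {lidar_track_index: rgbd_detection_index}."""
    matches = {}
    used = set()
    for i, (lx, ly) in enumerate(lidar_tracks):
        best_j, best_d = None, ASSOCIATION_GATE_M
        for j, (cx, cy) in enumerate(rgbd_detections):
            if j in used:
                continue
            d = math.hypot(lx - cx, ly - cy)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches


if __name__ == "__main__":
    legs = [(1.8, 0.2), (2.5, -1.0)]        # leg-track positions from the 2D-LiDAR
    people = [(2.45, -0.95), (1.75, 0.25)]  # person centroids from the RGB-D camera
    print(associate_tracks(legs, people))   # {0: 1, 1: 0}
```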
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Van Toan, N., Bach, SH. & Yi, SY. A hierarchical approach for updating targeted person states in human-following mobile robots. Intel Serv Robotics 16, 287–306 (2023). https://doi.org/10.1007/s11370-023-00463-9