Sampling-efficient path planning and improved actor-critic-based obstacle avoidance for autonomous robots

Yang, Yefeng; Huang, Tao; Wang, Tianqi; Yang, Wenyu; Chen, Han; Li, Boyang; Wen, Chih-yung

doi:10.1007/s11432-022-3904-9

Research Paper
Published: 26 April 2024

Volume 67, article number 152204 (2024)

Science China Information Sciences

Yefeng Yang^1,2,
Tao Huang^1,2,
Tianqi Wang^1,
Wenyu Yang^1,
Han Chen^1,
Boyang Li^4 &
Chih-yung Wen^1,3

Abstract

Autonomous robots are widely used in diverse fields, and path planning is one of the most critical problems for such systems. Despite years of development, path planning for autonomous systems still suffers from low planning efficiency and from obstacle avoidance responses that are not timely enough. This study proposes a novel and systematic solution to the path planning problem in intricate office buildings. The solution consists of a global planner and a local planner. For global planning, an adaptive clustering-based dynamic programming rapidly exploring random tree (ACDP-RRT) algorithm is proposed. ACDP-RRT identifies obstacles on the map by leveraging geometric features and represents them as a collection of sequentially arranged convex polygons, which optimizes the sampling region and significantly enhances sampling efficiency. For local planning, a network decoupling actor-critic (ND-AC) algorithm is employed. ND-AC simplifies the local planner design by integrating the planning and control loops into a neural network (NN) trained via an end-to-end model-free deep reinforcement learning (DRL) framework. Moreover, the network decoupling (ND) technique improves the obstacle avoidance success rate compared with conventional actor-critic (AC)-based methods. Extensive simulations and experiments are conducted to demonstrate the effectiveness and robustness of the proposed approach.
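
The idea of restricting RRT sampling to a reduced region can be illustrated with a minimal Python sketch. It is not the ACDP-RRT algorithm itself: it assumes the sampling region is already available as a union of convex polygons (the abstract does not spell out how that region is built), and the names `point_in_convex_polygon`, `sample_in_region`, and the `corridor` map are hypothetical, introduced only for illustration.

```python
import random


def point_in_convex_polygon(point, polygon):
    """Return True if point is inside or on a convex polygon.

    The polygon is a list of (x, y) vertices in counter-clockwise order
    (an assumption of this sketch).
    """
    px, py = point
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Cross product of edge AB with vector AP; a negative value means
        # P lies to the right of the edge, i.e. outside a CCW polygon.
        if (bx - ax) * (py - ay) - (by - ay) * (px - ax) < 0:
            return False
    return True


def sample_in_region(polygons, rng):
    """Rejection-sample one point uniformly from a union of convex polygons."""
    xs = [x for poly in polygons for x, _ in poly]
    ys = [y for poly in polygons for _, y in poly]
    while True:
        candidate = (rng.uniform(min(xs), max(xs)),
                     rng.uniform(min(ys), max(ys)))
        if any(point_in_convex_polygon(candidate, poly) for poly in polygons):
            return candidate


# Hypothetical L-shaped corridor described by two convex polygons.
corridor = [
    [(0, 0), (10, 0), (10, 2), (0, 2)],
    [(8, 2), (10, 2), (10, 10), (8, 10)],
]
rng = random.Random(0)
samples = [sample_in_region(corridor, rng) for _ in range(200)]
```

Every sample drawn this way falls inside the corridor, so a planner using it would spend no tree extensions on the 64% of the bounding box that lies outside the region. Rejection sampling is chosen here only for brevity; it becomes wasteful when the region is a small fraction of its bounding box.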

Acknowledgements

This work was supported by the Research Center of Unmanned Autonomous Systems (RCUAS), The Hong Kong Polytechnic University (Grant No. P0046487).

Author information

Authors and Affiliations

Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
Yefeng Yang, Tao Huang, Tianqi Wang, Wenyu Yang, Han Chen & Chih-yung Wen
Center for Control Theory and Guidance Technology, Harbin Institute of Technology, Harbin, 150001, China
Yefeng Yang & Tao Huang
Research Center for Unmanned Autonomous Systems, The Hong Kong Polytechnic University, Hong Kong, 999077, China
Chih-yung Wen
School of Engineering, The University of Newcastle, Callaghan, NSW, 2308, Australia
Boyang Li

Corresponding author

Correspondence to Boyang Li.

About this article

Cite this article

Yang, Y., Huang, T., Wang, T. et al. Sampling-efficient path planning and improved actor-critic-based obstacle avoidance for autonomous robots. Sci. China Inf. Sci. 67, 152204 (2024). https://doi.org/10.1007/s11432-022-3904-9

Received: 20 December 2022
Revised: 02 August 2023
Accepted: 18 November 2023
Published: 26 April 2024
DOI: https://doi.org/10.1007/s11432-022-3904-9
