Abstract
Autonomous robots are widely used in diverse fields, and path planning is among their most critical problems. Despite years of development, path planning for autonomous systems still suffers from low planning efficiency and from obstacle avoidance responses that are not timely enough. This study proposes a novel and systematic solution to the path planning problem in intricate office buildings. The solution consists of a global planner and a local planner. For global planning, an adaptive clustering-based dynamic programming rapidly exploring random tree (ACDP-RRT) algorithm is proposed. ACDP-RRT identifies obstacles on the map by leveraging geometric features and represents them as a collection of sequentially arranged convex polygons, which optimizes the sampling region and significantly enhances sampling efficiency. For local planning, a network decoupling actor-critic (ND-AC) algorithm is employed. ND-AC simplifies the local planner design by integrating the planning and control loops into a neural network (NN) trained via an end-to-end, model-free deep reinforcement learning (DRL) framework. Moreover, the network decoupling (ND) technique improves the obstacle avoidance success rate compared with conventional actor-critic (AC)-based methods. Extensive simulations and experiments demonstrate the effectiveness and robustness of the proposed approach.
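To illustrate why a convex-polygon obstacle representation helps a sampling-based planner, the following minimal sketch rejects random samples that fall inside any obstacle polygon. It is not the ACDP-RRT algorithm itself: the function names `inside_convex` and `sample_free`, the counter-clockwise vertex ordering, and the rectangular map bounds are all assumptions made for this example.

```python
import random

def inside_convex(point, polygon):
    """Return True if point is inside or on a convex polygon.

    Assumption: polygon vertices are listed in counter-clockwise order.
    """
    px, py = point
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Cross product of edge AB with AP; a negative value means
        # the point lies to the right of this edge, hence outside.
        if (bx - ax) * (py - ay) - (by - ay) * (px - ax) < 0:
            return False
    return True

def sample_free(obstacles, bounds, rng, max_tries=1000):
    """Rejection-sample a point in the bounding box outside every obstacle."""
    (xmin, xmax), (ymin, ymax) = bounds
    for _ in range(max_tries):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if not any(inside_convex(p, poly) for poly in obstacles):
            return p
    raise RuntimeError("no free sample found within max_tries")

# Hypothetical map: one square obstacle in a 4 x 4 workspace.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
rng = random.Random(0)
sample = sample_free([square], ((0, 4), (0, 4)), rng)
```

Because each obstacle is convex, the membership test needs only one cross product per edge, which keeps the per-sample cost low. A planner of the RRT family could call such a sampler when choosing the next point to extend toward.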
Acknowledgements
This work was supported by Research Center of Unmanned Autonomous Systems (RCUAS), The Hong Kong Polytechnic University (Grant No. P0046487).
Cite this article
Yang, Y., Huang, T., Wang, T. et al. Sampling-efficient path planning and improved actor-critic-based obstacle avoidance for autonomous robots. Sci. China Inf. Sci. 67, 152204 (2024). https://doi.org/10.1007/s11432-022-3904-9
DOI: https://doi.org/10.1007/s11432-022-3904-9