Sampling-efficient path planning and improved actor-critic-based obstacle avoidance for autonomous robots

  • Research Paper
  • Science China Information Sciences

Abstract

Autonomous robots are widely used in diverse fields, and path planning is among the most critical concerns for such systems. Despite years of development, path planning for autonomous systems still suffers from low planning efficiency and slow obstacle avoidance response. This study proposes a novel and systematic solution to the path planning problem in complex office buildings. The solution consists of a global planner and a local planner. For global planning, an adaptive clustering-based dynamic programming rapidly exploring random tree (ACDP-RRT) algorithm is proposed. ACDP-RRT identifies obstacles on the map by leveraging geometric features and represents them as a collection of sequentially arranged convex polygons, which optimizes the sampling region and significantly enhances sampling efficiency. For local planning, a network decoupling actor-critic (ND-AC) algorithm is employed. ND-AC simplifies local planner design by integrating the planning and control loops into a neural network (NN) trained via an end-to-end model-free deep reinforcement learning (DRL) framework. Moreover, the network decoupling (ND) technique improves the obstacle avoidance success rate compared with conventional actor-critic (AC)-based methods. Extensive simulations and experiments demonstrate the effectiveness and robustness of the proposed approach.
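
To make the role of convex-polygon obstacles in sampling more concrete, the following is a minimal Python sketch of obstacle-aware sampling for an RRT-style planner. It is not the ACDP-RRT algorithm itself, whose clustering and dynamic programming steps are not described in this abstract. The function names `inside_convex_polygon` and `sample_free`, the rectangular `bounds` tuple, and the assumption that each obstacle is given as counter-clockwise vertices are all hypothetical choices made for illustration.

```python
import random


def inside_convex_polygon(point, polygon):
    """Return True if point is inside or on a convex polygon.

    Assumption: polygon is a list of (x, y) vertices in counter-clockwise order.
    """
    px, py = point
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Cross product of edge A->B with A->P. A negative value means P lies
        # to the right of the edge, i.e. outside a counter-clockwise polygon.
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross < 0:
            return False
    return True


def sample_free(bounds, obstacles, rng, max_tries=1000):
    """Draw a uniform sample in bounds that avoids every convex obstacle.

    Plain rejection sampling, used here only as a stand-in for a smarter
    sampling-region optimization.
    """
    xmin, xmax, ymin, ymax = bounds
    for _ in range(max_tries):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if not any(inside_convex_polygon(p, poly) for poly in obstacles):
            return p
    raise RuntimeError("no collision-free sample found")


# Hypothetical map: a 10 x 10 workspace with one square obstacle.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
rng = random.Random(0)
samples = [sample_free((0.0, 10.0, 0.0, 10.0), [square], rng) for _ in range(100)]
```

Because every obstacle is convex, the membership test needs only one cross product per edge, which keeps each sample cheap to validate. A planner that restricts the sampling region in advance, as the abstract describes, would avoid most of the rejected draws that this sketch still pays for.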

Acknowledgements

This work was supported by the Research Center of Unmanned Autonomous Systems (RCUAS), The Hong Kong Polytechnic University (Grant No. P0046487).

Author information

Corresponding author

Correspondence to Boyang Li.

About this article

Cite this article

Yang, Y., Huang, T., Wang, T. et al. Sampling-efficient path planning and improved actor-critic-based obstacle avoidance for autonomous robots. Sci. China Inf. Sci. 67, 152204 (2024). https://doi.org/10.1007/s11432-022-3904-9

  • DOI: https://doi.org/10.1007/s11432-022-3904-9
