Abstract
Collaborative and autonomous robots are increasingly important in meeting the demands of a faster and more cost-effective market. To ensure production efficiency and safety, robots must respond swiftly to the presence of human operators or other dynamic obstacles, avoiding potential collisions by quickly planning alternative paths. Deep Reinforcement Learning (DRL)-based methods have shown great potential in path planning owing to their rapid response capabilities. However, existing DRL-based planners lack a safety verification system to evaluate the feasibility of actions generated by neural models, and they cannot guarantee 100% collision-free paths. This paper presents an enhanced DRL-based path planning system that incorporates a robust safety verification mechanism, which predicts potential collisions and generates alternative collision-free paths as necessary. We analyzed the essential elements of trajectory planning with the DRL method and proposed improvements to accelerate planning. The results demonstrate that our planner consistently generates paths for typical reaching tasks with an average planning time of 12.1 ms, a notable improvement over traditional algorithms. Moreover, the paths produced by our method are nearly optimal, comparable to those generated by optimization-based algorithms.
Data Availability
Data will be made available upon reasonable request.
References
Zabalza, J., Fei, Z., Wong, C., Yan, Y., Mineo, C., Yang, E., Rodden, T., Mehnen, J., Pham, Q., Ren, J.: Smart sensing and adaptive reasoning for enabling industrial robots with interactive human-robot capabilities in dynamic environments: a case study. Sensors 19(6), 1354 (2019)
Nicola, G., Ghidoni, S.: Deep Reinforcement Learning for Motion Planning in Human Robot Cooperative Scenarios. In: 2021 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA). IEEE, 1–7 (2021)
Li, S., Han, K., Li, X., Zhang, S., Xiong, Y., Xie, Z.: Hybrid trajectory replanning-based dynamic obstacle avoidance for physical human-robot interaction. J. Intell. Rob. Syst. 103(3), 1–14 (2021)
LaValle, S.: Rapidly-exploring random trees: a new tool for path planning. Tech. Rep. TR 98-11, Iowa State University (1998)
Long, H., Li, G., Zhou, F., Chen, T.: Cooperative dynamic motion planning for dual manipulator arms based on RRT*Smart-AD algorithm. Sensors 23(18), 7759 (2023)
Yuan, C., Shuai, C., Zhang, W.: A dynamic multiple-query RRT planning algorithm for manipulator obstacle avoidance. Appl. Sci. 13(6), 3394 (2023)
Yu, Y., Zhang, Y.: Collision avoidance and path planning for industrial manipulator using slice-based heuristic fast marching tree. Robot. Comput.-Integr. Manuf. 75, 102289 (2022)
Merckaert, K., Convens, B., Nicotra, M., Vanderborght, B.: Real-time constraint-based planning and control of robotic manipulators for safe human-robot collaboration. Robot. Comput.-Integr. Manuf. 87, 102711 (2024)
Wei, S., Liu, B., Yao, M., Yu, X., Tang, L.: Efficient online motion planning method for the robotic arm to pick-up moving objects smoothly with temporal constraints. Proc. Inst. Mech. Eng. 236(15), 8650–8662 (2022)
Dam, T., Chalvatzaki, G., Peters, J., Pajarinen, J.: Monte-Carlo robot path planning. IEEE Robot. Autom. Lett. 7(4), 11213–11220 (2022)
Cao, X., Zou, X., Jia, C., Chen, M., Zeng, Z.: RRT-based path planning for an intelligent litchi-picking manipulator. Comput. Electron. Agric. 156, 105–118 (2019)
Yuan, C., Liu, G., Zhang, W., Pan, X.: An efficient RRT cache method in dynamic environments for path planning. Robot. Auton. Syst. 131, 103595 (2020)
Zhang, H., Wang, Y., Zheng, J., Yu, J.: Path planning of industrial robot based on improved RRT algorithm in complex environments. IEEE Access 6, 53296–53306 (2018)
Ichter, B., Harrison, J., Pavone, M.: Learning sampling distributions for robot motion planning. In: 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 7087–7094 (2018)
Wang, J., Chi, W., Li, C., Wang, C., Meng, M.: Neural RRT*: learning-based optimal path planning. IEEE Trans. Autom. Sci. Eng. 17(4), 1748–1758 (2020)
Ma, N., Wang, J., Liu, J., Meng, M.: Conditional generative adversarial networks for optimal path planning. IEEE Trans. Cogn. Dev. Syst. 14(2), 662–671 (2022)
Wang, Y., Wei, L., Du, K., Liu, G., Yang, Q., Wei, Y., Fang, Q.: An online collision-free trajectory generation algorithm for human-robot collaboration. Robot. Comput.-Integr. Manuf. 80, 102475 (2023)
Power, T., Berenson, D.: Learning a generalizable trajectory sampling distribution for model predictive control. IEEE Trans. Rob. 40, 2111–2127 (2024)
Lee, C., Song, K.: Path re-planning design of a cobot in a dynamic environment based on current obstacle configuration. IEEE Robot. Autom. Lett. 8(3), 1183–1190 (2023)
Jiang, L., Liu, S., Cui, Y., Jiang, H.: Path planning for robotic manipulator in complex multi-obstacle environment based on improved_RRT. IEEE/ASME Trans. Mechatron. 27(6), 4774–4785 (2022)
Ratliff, N., Zucker, M., Bagnell, J., Srinivasa, S.: CHOMP: Gradient optimization techniques for efficient motion planning. In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, 489–494 (2009).
Kalakrishnan, M., Chitta, S., Theodorou, E., Pastor, P., Schaal, S.: STOMP: Stochastic trajectory optimization for motion planning. In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, 4569–4574 (2011).
Park, C., Pan, J., Manocha, D.: ITOMP: Incremental Trajectory Optimization for Real-time Replanning in Dynamic Environments. In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) 22, 207–215 (2012)
Finean, M., Petrovic, L., Merkt, W., Markovic, I., Havoutis, I.: Motion planning in dynamic environments using context-aware human trajectory prediction. Robot. Auton. Syst. 166, 104450 (2023)
Dong, J., Mukadam, M., Dellaert, F., Boots, B.: Motion Planning as Probabilistic Inference using Gaussian Processes and Factor Graphs. In: Robotics: Science and Systems (RSS) (2016)
Finean, M., Merkt, W., Havoutis, I.: Simultaneous Scene Reconstruction and Whole-Body Motion Planning for Safe Operation in Dynamic Environments. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 3710–3717 (2021)
Kuntz, A., Bowen, C., Alterovitz, R.: Fast Anytime Motion Planning in Point Clouds by Interleaving Sampling and Interior Point Optimization. In: International Symposium on Robotics Research (ISRR). Springer, 929–945 (2020)
Alwala, K., Mukadam, M.: Joint Sampling and Trajectory Optimization over Graphs for Online Motion Planning. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 4700–4707 (2021).
Watkins, C. J. C. H.: Learning from delayed rewards. PhD Thesis, King's College, University of Cambridge (1989)
Salmaninejad, M., Zilles, S., Mayorga, R.: Motion Path Planning of Two Robot Arms in a Common Workspace. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 45–51 (2020).
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.: Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602 (2013).
Petrenko, V., Tebueva, F., Ryabtsev, S., Gurchinsky, M.: Method of Controlling the Movement of an Anthropomorphic Manipulator in the Working Area With Dynamic Obstacle. In: 8th Scientific Conference on Information Technologies for Intelligent Decision Making Support (ITIDS). IEEE, 359–364 (2020).
Alam, M.S., Sudha, S.K.R., Somayajula, A.: AI on the Water: Applying DRL to Autonomous Vessel Navigation. arXiv:2310.14938 (2023)
Regunathan, R.D., Sudha, S.K.R., Alam, M.S., Somayajula, A.: Deep Reinforcement Learning Based Controller for Ship Navigation. Ocean Eng. 273, 113937 (2023)
Lillicrap, T., Hunt, J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. arXiv:1509.02971 (2015).
Li, Z., Ma, H., Ding, Y., Wang, C., Jin, Y.: Motion planning of six-dof arm robot based on improved DDPG algorithm. In: 2020 39th Chinese Control Conference (CCC). IEEE, 3954–3959 (2020).
Lindner, T., Milecki, A.: Reinforcement learning-based algorithm to avoid obstacles by the anthropomorphic robotic arm. Appl. Sci. 12, 6629 (2022)
Zeng, R., Liu, M., Zhang, J., Li, X., Zhou, Q., Jiang, Y.: Manipulator Control Method Based on Deep Reinforcement Learning. In: 2020 Chinese Control And Decision Conference (CCDC). IEEE, 415–420 (2020).
Um, D., Nethala, P., Shin, H.: Hierarchical DDPG for manipulator motion planning in dynamic environments. AI 3(3), 645–658 (2022)
Jose, J., Alam, M.S., Somayajula, A.S.: Navigating the Ocean with DRL: Path following for marine vessels. arXiv:2310.14932 (2023)
Fujimoto, S., van Hoof, H., Meger, D.: Addressing function approximation error in actor-critic methods. arXiv:1802.09477 (2018).
Wang, S., Yi, W., He, Z., Xu, J., Yang, L.: Safe reinforcement learning-based trajectory planning for industrial robot. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 3471–3476 (2020).
Huang, Z., Chen, G., Shen, Y., Wang, R., Liu, C., Zhang, L.: An obstacle-avoidance motion planning method for redundant space robot via reinforcement learning. Actuators 12(2), 69 (2023)
Chen, P., Pei, J., Lu, W., Li, M.: A deep reinforcement learning based method for real-time path planning and dynamic obstacle avoidance. Neurocomputing 497, 64–75 (2022)
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized Experience Replay. arXiv:1511.05952 (2015)
Andrychowicz, M., Crow, D., Ray, A., Schneider, J., Fong, R., Welinder, P., ..., Zaremba, W.: Hindsight Experience Replay. arXiv:1707.01495 (2017)
Feng, X.: Consistent experience replay in high-dimensional continuous control with decayed hindsights. Machines 10, 856 (2022)
Kim, S., An, B.: Learning Heuristic A*: Efficient Graph Search using Neural Network. In: 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 9542–9547 (2020)
Prianto, E., Park, J.H., Bae, J.H., Kim, J.S.: Deep reinforcement learning-based path planning for multi-arm manipulators with periodically moving obstacles. Appl. Sci. 11(6), 2587 (2021)
Ren, Z., Dong, K., Zhou, Y., Liu, Q., Peng, J.: Exploration via Hindsight Goal Generation. Adv. Neural Inf. Process Syst. 32 (2019).
Bing, Z., Brucker, M., Morin, F.O., Li, R., Su, X., Huang, K., Knoll, A.: Complex robotic manipulation via graph-based hindsight goal generation. IEEE Trans. Neural Netw. Learn. Syst. 33(12), 7863–7876 (2021)
Bing, Z. S., Alvarez, E., Cheng, L., Morin, F. O., Li, R., Su, X. J., ..., Knoll, A.: Robotic Manipulation in Dynamic Scenarios via Bounding-Box-Based Hindsight Goal Generation. IEEE Trans. Neural Netw. Learn. Syst. 34(8), 5037–5050 (2023).
Althoff, M., Dolan, J.M.: Online verification of automated road vehicles using reachability analysis. IEEE Trans. Rob. 30(4), 903–918 (2014)
Chan, C.C., Tsai, C.C.: Collision-free path planning based on new navigation function for an industrial robotic manipulator in human-robot coexistence environments. J. Chin. Inst. Eng. 43(6), 508–518 (2020)
Zhao, J. B., Zhao, Q., Wang, J. Z., Zhang, X., Wang, Y. L.: Path Planning and Evaluation for Obstacle Avoidance of Manipulator Based on Improved Artificial Potential Field and Danger Field. In: 33rd Chinese Control and Decision Conference (CCDC). IEEE, 3018–3025 (2021).
Tulbure, A., Khatib, O.: Closing the Loop: Real-Time Perception and Control for Robust Collision Avoidance with Occluded Obstacles. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 5700–5707 (2020).
Zhao, M., Lv, X.Q.: Improved manipulator obstacle avoidance path planning based on potential field method. J. Robot. 2020, 1–12 (2020)
Zhang, H., Zhu, Y.F., Liu, X.F., Xu, X.R.: Analysis of obstacle avoidance strategy for dual-arm robot based on speed field with improved artificial potential field algorithm. Electronics 10(15), 1850 (2021)
Elahres, M., Fonte, A., Poisson, G.: Evaluation of an artificial potential field method in collision-free path planning for a robot manipulator. In: 2nd International Conference on Robotics, Computer Vision and Intelligent Systems (ROBOVIS). 92–102 (2021).
Khatib, O.: Real-time obstacle avoidance for manipulators and mobile robots. In: Proceedings of the 1985 IEEE International Conference on Robotics and Automation. IEEE, 500–505 (1985).
Kavraki, L.E., Svestka, P., Latombe, J., Overmars, M.H.: Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Trans. Robot. Autom. 12(4), 566–580 (1996)
Karaman, S., Frazzoli, E.: Sampling-based algorithms for optimal motion planning. Int. J. Robot. Res. 30(7), 846–894 (2011)
Mukadam, M., Dong, J., Yan, X., Dellaert, F., Boots, B.: Continuous-time Gaussian process motion planning via probabilistic inference. Int. J. Robot. Res. 37(11), 1319–1340 (2018)
Thakar, S., Rajendran, P., Kim, H., Kabir, A. M., Gupta, S. K.: Accelerating bi-directional sampling-based search for motion planning of non-holonomic mobile manipulators. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 6711–6717 (2020).
Gammell, J.D., Srinivasa, S.S., Barfoot, T.D.: Batch Informed Trees (BIT*): Sampling-based optimal planning via the heuristically guided search of implicit random geometric graphs. In: 2015 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 3067–3074 (2015)
Schulman, J., Ho, J., Lee, A.X., Awwal, I., Bradlow, H., Abbeel, P.: Finding locally optimal, collision-free trajectories with sequential convex optimization. Robot. Sci. Syst. IX 9(1), 1–10 (2013)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: International Conference on Machine Learning (ICML). PMLR, 1861–1870 (2018)
Funding
This work was supported by The Ministry of Higher Education for the Fundamental Research Grant Scheme (FRGS/1/2022/TK10/UM/02/7) awarded to Ir. Dr. Hwa-Jen Yap (Universiti Malaya) and Application Innovation Project of Hebei Vocational University of Technology and Engineering (202205).
Author information
Authors and Affiliations
Contributions
Conceptualization: [JL], [HJY], [ASMK]; Methodology: [JL], [HJY], [ASMK]; Formal analysis and investigation: [JL]; Writing—original draft preparation: [JL]; Writing—review and editing: [HJY], [ASMK]; Funding acquisition, resources: [HJY], [JL]; Supervision: [HJY], [ASMK].
Corresponding author
Ethics declarations
Competing Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A
1.1 Implementation Details
A. Sampling-Based Algorithms
In sampling-based algorithms, the orientation of the end-effector remains constant throughout the planning process. Planning proceeds by first sampling positions for the Tool Center Point (TCP) within the workspace and then applying inverse kinematics to compute the joint angles for these positions. Collisions are detected with the ‘getContactPoints’ API in PyBullet.
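For concreteness, the following is a minimal sketch of this sample-IK-check step, assuming a connected PyBullet client and that robot_id, obstacle_ids, the end-effector link index, and the axis-aligned workspace bounds are available; all names are illustrative, not taken from our implementation.

```python
import random
import pybullet as p

def movable_joints(robot_id):
    """Indices of the robot's non-fixed joints (IK returns one value per DoF)."""
    return [j for j in range(p.getNumJoints(robot_id))
            if p.getJointInfo(robot_id, j)[2] != p.JOINT_FIXED]

def sample_free_config(robot_id, obstacle_ids, ee_link, tcp_orn, ws_min, ws_max):
    """Sample a TCP position in the workspace box, solve IK with a fixed
    end-effector orientation, and reject configurations in collision."""
    tcp_pos = [random.uniform(lo, hi) for lo, hi in zip(ws_min, ws_max)]
    q = p.calculateInverseKinematics(robot_id, ee_link, tcp_pos, tcp_orn)
    for joint, angle in zip(movable_joints(robot_id), q):
        p.resetJointState(robot_id, joint, angle)
    p.performCollisionDetection()  # refresh contacts without stepping physics
    for obs in obstacle_ids:
        if p.getContactPoints(bodyA=robot_id, bodyB=obs):
            return None  # sample rejected: the robot touches an obstacle
    return q
```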
1) Common Parameters for Sampling-Based Algorithms

- Max Samples: the algorithm can generate up to 1000 random samples to grow the tree.
- Step Length: the maximum step length for tree extension towards a sampled point is 0.1 m per iteration.
- Collision Checking: performed every 0.02 m when connecting path points to ensure no collisions.
- Goal Threshold: the goal is considered reached once the tree comes within 0.1 m of the goal position.

2) Specific Algorithm Parameters

- Bias Goal RRT: a bias factor of 0.5 is used, meaning that after determining the nearest node to a random sample, an additional 50% of the maximum step length is taken towards the goal (see the sketch after this list).
- RRT*: a search radius of 0.5 m is applied for local optimization to minimize path length.
- PRM: a sample size of 1000 is employed with a maximum step length of 0.1 m for smooth transitions, and 10 nearest neighbors are considered to improve connectivity.
- BIT*: as with PRM, a sample size of 1000 and a maximum step length of 0.1 m are used; heuristic and cost calculations further reduce path length.
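The Bias Goal RRT extension rule, as we describe it above, can be sketched as follows; this is a hedged NumPy illustration in which nodes, sample, and goal are assumed to be configuration-space points, and the per-segment collision checking (every 0.02 m) is left out for brevity.

```python
import numpy as np

STEP = 0.1        # maximum step length per extension (m)
GOAL_BIAS = 0.5   # extra fraction of STEP taken towards the goal

def unit(v):
    """Direction of v (zero-safe)."""
    n = np.linalg.norm(v)
    return v / n if n > 1e-9 else v

def extend(nodes, sample, goal):
    """One Bias Goal RRT extension: step from the nearest node towards the
    random sample, then take an extra 50% of STEP towards the goal."""
    nearest = min(nodes, key=lambda q: np.linalg.norm(q - sample))
    new = nearest + STEP * unit(sample - nearest)
    new = new + GOAL_BIAS * STEP * unit(goal - new)
    return nearest, new
```

The segment from nearest to new would then be collision-checked at 0.02 m intervals before new is added to the tree.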
B. Optimization-Based Algorithms
In optimization-based algorithms, the path planning problem is framed as an optimization task. The objective is to find a feasible trajectory that minimizes a cost function while meeting certain constraints. The general formulation for optimization-based motion planning can be expressed as:

$$\underset{S}{\mathrm{min}}\;\mathcal{F}(S)\quad \mathrm{s.t.}\quad g(S)\le 0,\;h(S)=0$$

where \(g\) and \(h\) collect the inequality and equality constraints given below.

1) Common Settings for Optimization-Based Algorithms

The objective function consists of multiple cost components, including path length, obstacle avoidance, smoothness, and roughness. The function \(\mathcal{F}\) is expressed as:

$$\mathcal{F}={R}_{l}+{R}_{s}+{R}_{r}+{R}_{o}$$

where: \({R}_{l}\) is the path length cost:

$${R}_{l}=\sum_{t=1}^{T-1}\Vert {S}_{t+1}-{S}_{t}\Vert$$

\({R}_{s}\) is the smoothness cost:

$${R}_{s}=\sum_{t=1}^{T-1}{\Vert {S}_{t+1}-{S}_{t}\Vert }^{2}$$

\({R}_{r}\) is the roughness cost [100]:

$${R}_{r}=\sum_{t=2}^{T-1}{\Vert {S}_{t+1}-2{S}_{t}+{S}_{t-1}\Vert }^{2}$$

\({R}_{o}\) is the obstacle cost:

$${R}_{o}=\alpha \sum_{t=1}^{T}\sum_{i}\mathrm{max}(0,\beta -{d}_{t,i})$$

Here, \({S}_{t}\) represents the position of the TCP at the \(t\)-th discrete point, and \(T=10\) is the total number of discrete points. \({d}_{t,i}\) is the distance between the robot and the \(i\)-th closest obstacle at the \(t\)-th waypoint, obtained from the closest points returned by the API ‘getClosestPoints’. \(\alpha =100\) is a scaling factor for the obstacle cost, and \(\beta =0.1\) m is the threshold distance below which the obstacle cost starts to increase.
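Under these definitions, the total cost can be evaluated as in the sketch below; the array shapes and function name are assumptions, and dists would be filled from the ‘getClosestPoints’ results at each waypoint.

```python
import numpy as np

ALPHA, BETA, T = 100.0, 0.1, 10  # obstacle scale, distance threshold (m), waypoints

def trajectory_cost(S, dists):
    """Total cost F = R_l + R_s + R_r + R_o for a discretized TCP path.

    S:     (T, 3) array of TCP positions S_t.
    dists: (T, n_obs) array of robot-obstacle distances d_{t,i}.
    """
    step = np.diff(S, axis=0)                          # S_{t+1} - S_t
    R_l = np.linalg.norm(step, axis=1).sum()           # path length
    R_s = (np.linalg.norm(step, axis=1) ** 2).sum()    # smoothness (1st differences)
    acc = np.diff(S, n=2, axis=0)                      # S_{t+1} - 2 S_t + S_{t-1}
    R_r = (np.linalg.norm(acc, axis=1) ** 2).sum()     # roughness (2nd differences)
    R_o = ALPHA * np.maximum(0.0, BETA - dists).sum()  # penalize distances below beta
    return R_l + R_s + R_r + R_o
```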
- Inequality Constraints: the joint angles \({\theta }_{j}(t)\) must stay within allowable limits:

$${\theta }_{j}^{min}\le {\theta }_{j}(t)\le {\theta }_{j}^{max},\quad j=1,\dots ,n \quad (15)$$

where \(n\) is the number of joints.

- Equality Constraints: the initial and final positions must coincide with the start and goal positions:

$$s\left({t}_{0}\right)={s}_{\text{start}},\quad s({t}_{f})={s}_{\text{goal}} \quad (16)$$

- Initialization: a straight-line trajectory is generated as an initial guess, and optimization refines the trajectory until either a convergence threshold of 1e-4 is reached or the maximum of 100 iterations is completed.

2) Specific Algorithm Parameters

- TrajOpt: gradients are computed by perturbing trajectory points by a small value (1e-5) and observing the changes in cost. The trajectory points are iteratively updated using a learning rate of 0.01 until convergence or the maximum number of iterations (see the sketch after this list).
- CHOMP: like TrajOpt, CHOMP computes gradients through perturbation (epsilon = 1e-5) but updates points using covariant gradient descent. A Hessian approximation is used to speed up the process.
- STOMP: this algorithm uses a stochastic method, generating noisy trajectories with random perturbations (standard deviation of 0.01) and then updating points based on the costs of these trajectories, iterating until convergence or the iteration limit.
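The TrajOpt-style loop described above can be sketched as follows, assuming cost_fn is a single-argument callable (e.g., a closure over the current obstacle distances built from trajectory_cost above). The first and last waypoints are held fixed, which enforces the equality constraints (16); CHOMP and STOMP differ mainly in how the update step is computed.

```python
import numpy as np

EPS, LR = 1e-5, 0.01        # perturbation size and learning rate
MAX_ITERS, TOL = 100, 1e-4  # iteration limit and convergence threshold

def refine(S, cost_fn):
    """Finite-difference descent on a trajectory S of shape (T, 3)."""
    prev = cost_fn(S)
    for _ in range(MAX_ITERS):
        grad = np.zeros_like(S)
        for t in range(1, len(S) - 1):          # interior waypoints only
            for k in range(S.shape[1]):
                S[t, k] += EPS                  # perturb one coordinate
                grad[t, k] = (cost_fn(S) - prev) / EPS
                S[t, k] -= EPS                  # undo the perturbation
        S[1:-1] -= LR * grad[1:-1]              # gradient step, endpoints pinned
        cur = cost_fn(S)
        if abs(prev - cur) < TOL:               # convergence threshold of 1e-4
            break
        prev = cur
    return S
```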
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, J., Yap, H.J. & Khairuddin, A.S.M. Path Planning for the Robotic Manipulator in Dynamic Environments Based on a Deep Reinforcement Learning Method. J Intell Robot Syst 111, 3 (2025). https://doi.org/10.1007/s10846-024-02205-0