
A hierarchical reinforcement learning approach for optimal path tracking of wheeled mobile robots

  • ISNN2012
  • Published in: Neural Computing and Applications (2013)

Abstract

Robust motion control is fundamental to autonomous mobile robots. In recent years, reinforcement learning (RL) has attracted considerable attention in the feedback control of wheeled mobile robots. However, it remains difficult for RL to solve problems with large or continuous state spaces, which are common in robotics. To improve the generalization ability of RL, this paper presents a novel hierarchical RL approach for optimal path tracking of wheeled mobile robots. In the proposed approach, a graph Laplacian-based hierarchical approximate policy iteration (GHAPI) algorithm is developed, in which the basis functions are constructed automatically using the graph Laplacian operator. In GHAPI, the state space of a Markov decision process is divided into several subspaces, and approximate policy iteration is carried out on each subspace. A near-optimal path-tracking control strategy is then obtained by combining GHAPI with proportional-derivative (PD) control. The performance of the proposed approach is evaluated on a P3-AT wheeled mobile robot, and it is demonstrated that the GHAPI-based PD control obtains better near-optimal control policies than previous approaches.
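For intuition, the following is a minimal sketch of the two ingredients named in the abstract: basis functions built from the graph Laplacian of a state-space graph (proto-value functions) and least-squares policy evaluation restricted to one subspace of the decomposed MDP. All function names, parameters, and the toy chain MDP are illustrative assumptions rather than the authors' implementation; GHAPI would repeat the subspace-level evaluation inside approximate policy iteration on every subspace and combine the learned policy with a PD path-tracking controller.

# A minimal sketch (assumed names and parameters, not the authors' code) of
# graph-Laplacian basis construction and least-squares policy evaluation on
# one subspace of the decomposed MDP.
import numpy as np

def laplacian_basis(adjacency, num_basis):
    """Proto-value functions: smoothest eigenvectors of the normalized Laplacian."""
    degree = adjacency.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(degree, 1e-12)))
    laplacian = np.eye(len(adjacency)) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    _, eigvecs = np.linalg.eigh(laplacian)   # eigenvalues sorted ascending
    return eigvecs[:, :num_basis]            # one column per basis function

def lstd_value(phi, samples, gamma=0.95):
    """LSTD policy evaluation on one subspace; samples are (s, r, s_next) indices."""
    k = phi.shape[1]
    A, b = np.zeros((k, k)), np.zeros(k)
    for s, r, s_next in samples:
        A += np.outer(phi[s], phi[s] - gamma * phi[s_next])
        b += phi[s] * r
    w = np.linalg.solve(A + 1e-6 * np.eye(k), b)   # small ridge for stability
    return phi @ w                                 # approximate state values

# Toy subspace: a chain of 20 discretized states with a reward at the right end.
n = 20
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
phi = laplacian_basis(adjacency, num_basis=5)
samples = [(i, 1.0 if i + 1 == n - 1 else 0.0, i + 1) for i in range(n - 1)]
print(lstd_value(phi, samples).round(2))

On the physical robot, the state would presumably be the continuous tracking error, the graph would be built from sampled transitions rather than a hand-coded chain, and the RL policy output would be combined with the PD control law as described in the abstract.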



Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants 61075072 and 90820302, and by the Program for New Century Excellent Talents in University under Grant NCET-10-0901.

Author information

Corresponding author

Correspondence to Xin Xu.

About this article

Cite this article

Zuo, L., Xu, X., Liu, C. et al. A hierarchical reinforcement learning approach for optimal path tracking of wheeled mobile robots. Neural Comput & Applic 23, 1873–1883 (2013). https://doi.org/10.1007/s00521-012-1243-4

