
An Improved Q-Learning Algorithm for Path Planning in Maze Environments

  • Conference paper
Intelligent Systems and Applications (IntelliSys 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1251)


Abstract

Path planning is the problem of finding an optimal path in a given environment, and it has become an important benchmark for testing intelligent learning algorithms. In AI-based path planning, the earliest and most deeply studied issue is intelligent obstacle avoidance: an agent must successfully avoid all obstacles or traps in an unknown environment. Compared with other learning methods, RL (Reinforcement Learning) has inherent advantages in path planning. Unlike most machine learning methods, RL is an unsupervised, active learning method: it can not only achieve effective obstacle avoidance, but also, through repeated trials, find the optimal path in an unfamiliar environment such as a maze. Q-Learning is recognized as one of the most typical RL algorithms. It is simple and practical, but it suffers from the significant disadvantage of slow convergence. This paper proposes an algorithm called ɛ-Q-Learning, which improves the traditional Q-Learning algorithm with a Dynamic Search Factor technique. Experiments show that, compared with existing Q-Learning algorithms, ɛ-Q-Learning finds better optimal paths at a lower search cost.
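The abstract does not give the paper's exact Dynamic Search Factor update, but the general idea it describes, an ɛ-greedy Q-Learning agent whose exploration rate is adjusted over the course of training rather than held fixed, can be illustrated with a short sketch. The Python code below is such a sketch, not the authors' implementation: the maze interface (env.reset(), env.step(), env.n_states, env.n_actions) is a hypothetical API, and the linear decay of ɛ is an assumed stand-in for the paper's dynamic factor.

    import numpy as np

    def epsilon_q_learning(env, episodes=500, alpha=0.1, gamma=0.9,
                           eps_start=1.0, eps_end=0.05):
        # Tabular Q-values: one row per maze state, one column per action.
        Q = np.zeros((env.n_states, env.n_actions))
        for ep in range(episodes):
            # Illustrative "dynamic search factor": decay epsilon linearly
            # from eps_start to eps_end, so early episodes explore and late
            # episodes exploit. The paper's actual schedule is not given in
            # the abstract; this linear rule is an assumption.
            eps = eps_start - (eps_start - eps_end) * ep / (episodes - 1)
            state = env.reset()          # hypothetical maze API
            done = False
            while not done:
                # Epsilon-greedy selection: explore with probability eps,
                # otherwise take the current greedy action.
                if np.random.rand() < eps:
                    action = np.random.randint(env.n_actions)
                else:
                    action = int(np.argmax(Q[state]))
                next_state, reward, done = env.step(action)
                # Standard Q-Learning (off-policy TD) update.
                Q[state, action] += alpha * (
                    reward + gamma * np.max(Q[next_state]) - Q[state, action])
                state = next_state
        return Q

After training, the greedy policy np.argmax(Q[s]) traces the learned path through the maze; shrinking ɛ over time is what lets the agent stop paying exploration cost once the path estimates have stabilized.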



Author information


Corresponding author

Correspondence to Guojun Mao.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Gu, S., Mao, G. (2021). An Improved Q-Learning Algorithm for Path Planning in Maze Environments. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1251. Springer, Cham. https://doi.org/10.1007/978-3-030-55187-2_40
