Abstract
This paper applies reinforcement learning methods to the game Sokoban, a popular puzzle that is relatively easy for humans to solve. It nevertheless poses a significant challenge for computer algorithms because certain moves are irreversible. Predicting which actions will lead to such undesirable states is often difficult for a learning agent, a common problem in tasks that require planning. To address this issue, we propose combining a Monte-Carlo tree search (MCTS) algorithm with a heuristic convolutional neural network (CNN) trained specifically to separate undesirable, neutral, and desired game states. We experimented with several heuristic variants of the algorithm and compared them against each other. We implemented MCTS in two setups: one with a CNN trained on data obtained during the solving process, and one without such training. We also varied the number of rollouts per move in MCTS and compared the results. The research question is how to improve the performance of learning agents in tasks that require planning to avoid unwanted states.
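To make the three-way split of game states concrete, the following minimal Python sketch labels a Sokoban position as undesirable, neutral, or desired. It is an illustration only: the paper uses a trained CNN for this role, whereas the sketch substitutes a simple hand-written rule (a box stuck in a non-goal corner can never be pushed again). The names `classify_state`, `is_corner`, and the set-based board encoding are assumptions made for this example and do not come from the paper.

```python
from typing import FrozenSet, Tuple

Cell = Tuple[int, int]  # (row, column); hypothetical board encoding

UNDESIRABLE, NEUTRAL, DESIRED = -1, 0, 1


def is_corner(cell: Cell, walls: FrozenSet[Cell]) -> bool:
    """True if the cell has a wall on a vertical side AND on a horizontal side."""
    r, c = cell
    blocked_vertically = (r - 1, c) in walls or (r + 1, c) in walls
    blocked_horizontally = (r, c - 1) in walls or (r, c + 1) in walls
    return blocked_vertically and blocked_horizontally


def classify_state(boxes: FrozenSet[Cell],
                   goals: FrozenSet[Cell],
                   walls: FrozenSet[Cell]) -> int:
    """Hand-written stand-in for the heuristic CNN (an assumption, not the paper's model)."""
    # A box in a corner cannot be pushed in any direction; off a goal, that is a dead end.
    if any(box not in goals and is_corner(box, walls) for box in boxes):
        return UNDESIRABLE
    # Every box rests on a goal: the puzzle is solved.
    if boxes <= goals:
        return DESIRED
    return NEUTRAL


# Example level: a 5x5 room whose border is wall, with one goal in the centre.
walls = frozenset((r, c) for r in range(5) for c in range(5)
                  if r in (0, 4) or c in (0, 4))
goals = frozenset({(2, 2)})

corner_label = classify_state(frozenset({(1, 1)}), goals, walls)  # box in a corner
solved_label = classify_state(frozenset({(2, 2)}), goals, walls)  # box on the goal
open_label = classify_state(frozenset({(2, 1)}), goals, walls)    # box along a wall
```

In an MCTS setting, a label of this kind could be used to cut a rollout short or to penalise a node as soon as an undesirable state is reached, rather than waiting for the rollout to exhaust its step budget. How the authors actually integrate the CNN output into the search is not specified in the abstract.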
Acknowledgments
We would like to thank the Armed Forces of Ukraine for providing the security that made this work possible.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ignatenko, O., Pravosud, R. (2023). Solving Sokoban Game with a Heuristic for Avoiding Dead-End States. In: Antoniou, G., et al. Information and Communication Technologies in Education, Research, and Industrial Applications. ICTERI 2023. Communications in Computer and Information Science, vol 1980. Springer, Cham. https://doi.org/10.1007/978-3-031-48325-7_4
DOI: https://doi.org/10.1007/978-3-031-48325-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48324-0
Online ISBN: 978-3-031-48325-7
eBook Packages: Computer Science, Computer Science (R0)