Abstract
This paper explores ways to discover strategy from a state-action-state-reward log recorded during a reinforcement learning session. The term strategy here implies that we are interested not only in a single-step state-action pair but also in a fruitful sequence of state-actions. Traditional RL has proved that it can successfully learn a good sequence of actions; however, it is often observed that some of the learned action sequences could be more effective. For example, a five-step navigation toward the north can be achieved in thousands of ways if there are no other constraints, since an agent could follow numerous tactics to reach the same end result. Traditional RL approaches such as value learning or state-action value learning do not directly address this issue. In this preliminary experiment, sets of state-action pairs (i.e., one-step policies) are extracted from 10,446 records, grouped, and then joined to form a directed graph. This graph summarizes the policy learned by the agent. We argue that strategy can be extracted from the analysis of this graph network.
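To illustrate the idea of aggregating a log into a policy graph, the sketch below is a minimal, assumed implementation (the function names build_policy_graph and frequent_transitions and the toy grid-world log are illustrative only, not taken from the paper). It builds a weighted directed graph whose nodes are states and whose edges are observed actions; heavily weighted, high-reward paths in such a graph are the kind of structure from which candidate strategies could be read off.

```python
from collections import defaultdict

def build_policy_graph(log):
    """Aggregate (state, action, next_state, reward) tuples into a directed graph.

    graph[state][(action, next_state)] holds how often the transition occurred
    and the total reward collected along it.
    """
    graph = defaultdict(lambda: defaultdict(lambda: {"count": 0, "total_reward": 0.0}))
    for state, action, next_state, reward in log:
        edge = graph[state][(action, next_state)]
        edge["count"] += 1
        edge["total_reward"] += reward
    return graph

def frequent_transitions(graph, min_count=2):
    """Return transitions seen at least min_count times, sorted by average reward.

    Frequently traversed, high-reward edges are candidate fragments of a strategy.
    """
    results = []
    for state, edges in graph.items():
        for (action, next_state), stats in edges.items():
            if stats["count"] >= min_count:
                avg_reward = stats["total_reward"] / stats["count"]
                results.append((state, action, next_state, stats["count"], avg_reward))
    return sorted(results, key=lambda r: r[4], reverse=True)

# Toy usage with a hypothetical grid-world log: repeated northward moves.
log = [("s0", "N", "s1", 0.0), ("s1", "N", "s2", 1.0), ("s0", "N", "s1", 0.0)]
graph = build_policy_graph(log)
print(frequent_transitions(graph, min_count=2))
```

In this sketch, chaining the surviving edges end to end reproduces the directed graph described in the abstract; analysing its frequent paths is one plausible route to the multi-step strategies the paper targets.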