
Estimation of the Change of Agents Behavior Strategy Using State-Action History

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2017 (ICANN 2017)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10614))


Abstract

Reinforcement learning (RL) provides a computational model of how animals autonomously acquire behaviors even in uncertain environments. Inverse reinforcement learning (IRL) addresses the opposite problem: given a history of an agent's behaviors, IRL attempts to determine unknown characteristics of the agent, such as its reward function. Conventional IRL methods usually assume the agent follows a stationary policy that is optimal in the environment. However, real RL agents do not necessarily follow a stationary policy, because they are often still adapting to their environments. In particular, when facing an uncertain environment, an intelligent agent should adopt a mixed (or switching) strategy that combines exploitation, which is best in the current situation, with exploration, which resolves the environmental uncertainty. In this study, we propose a new IRL method that identifies both a non-stationary policy and a fixed but unknown reward function from the behavioral history of a learning agent; in particular, we estimate the change point at which the behavior policy switches from an exploratory one in the agent's early stage of learning to an exploitative one in its later stage. When applied to a computer simulation of an agent performing a simple maze task, our method identified the change point of the behavior policy and the fixed reward function from the agent's history of behaviors alone.
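The change-point idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the Q-table, the softmax behavior model, the two inverse temperatures, and the brute-force two-segment likelihood scan are all illustrative assumptions. An agent is assumed to act by a softmax policy whose inverse temperature switches from low (exploratory) to high (exploitative) at an unknown step, and the change point is recovered by maximizing the two-segment log-likelihood of the observed state-action history:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
# Hypothetical fixed Q-table the agent is assumed to act on.
Q = rng.normal(size=(n_states, n_actions))

def softmax_policy(Q, beta):
    """Softmax action probabilities with inverse temperature beta."""
    z = beta * Q
    z -= z.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# Simulate a state-action history whose behavior switches at step 60:
# nearly random actions (beta = 0.2), then nearly greedy ones (beta = 5.0).
T, true_cp = 120, 60
betas = [0.2, 5.0]
states = rng.integers(n_states, size=T)
actions = np.empty(T, dtype=int)
for t, s in enumerate(states):
    p = softmax_policy(Q, betas[t >= true_cp])[s]
    actions[t] = rng.choice(n_actions, p=p)

def loglik(seg_states, seg_actions, beta):
    """Log-likelihood of a segment under a single softmax policy."""
    p = softmax_policy(Q, beta)
    return np.log(p[seg_states, seg_actions]).sum()

# For each candidate change point, fit one beta per segment by grid
# search and score the split by the summed segment log-likelihoods.
beta_grid = np.linspace(0.05, 10.0, 50)

def best_ll(s, a):
    return max(loglik(s, a, b) for b in beta_grid)

candidates = range(10, T - 10)
scores = [best_ll(states[:c], actions[:c]) + best_ll(states[c:], actions[c:])
          for c in candidates]
cp_hat = 10 + int(np.argmax(scores))
print("estimated change point:", cp_hat)
```

The scan is quadratic in the history length but fine for a toy history; the same two-segment likelihood could equally be maximized over reward parameters rather than a known Q-table, which is closer to the IRL setting the paper targets.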




Author information

Correspondence to Shihori Uchida.



Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Uchida, S., Oba, S., Ishii, S. (2017). Estimation of the Change of Agents Behavior Strategy Using State-Action History. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science(), vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_12


  • DOI: https://doi.org/10.1007/978-3-319-68612-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68611-0

  • Online ISBN: 978-3-319-68612-7
