A Modified I2A Agent for Learning in a Stochastic Environment

  • Conference paper
  • In: Computational Collective Intelligence (ICCCI 2020)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12496)

Abstract

The paper proposes and analyses the evolution of a deep reinforcement learning agent in a stochastic environment that represents a simple game. We investigate the use of an embedded planning loop, driven by a learned model in the style of I2A (Imagination-Augmented Agents), in the training of a model-free agent to solve a stochastic grid environment. The performance of the proposed agent architecture is compared against a baseline A2C (Advantage Actor-Critic) agent.
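To make the architecture concrete, the sketch below shows an I2A-style actor-critic head in PyTorch. It is a minimal illustration under assumed details: the layer sizes, the flattened grid observation, the one-step environment model, the rollout length, and the greedy internal rollout policy are placeholders rather than the configuration used in the paper (the authors' implementation is available at https://github.com/ValentinPal/I2AGrid [17]).

```python
# Minimal I2A-style agent head (sketch). All dimensions and module
# choices below are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, N_ACTIONS, HIDDEN, ROLLOUT_LEN = 64, 4, 128, 3

class EnvModel(nn.Module):
    """Learned one-step model: (obs, action) -> (next obs, reward)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM + N_ACTIONS, HIDDEN), nn.ReLU())
        self.next_obs = nn.Linear(HIDDEN, OBS_DIM)
        self.reward = nn.Linear(HIDDEN, 1)

    def forward(self, obs, action_onehot):
        h = self.body(torch.cat([obs, action_onehot], dim=-1))
        return self.next_obs(h), self.reward(h)

class I2AAgent(nn.Module):
    """Combines a model-free path with encoded imagined rollouts."""
    def __init__(self):
        super().__init__()
        self.env_model = EnvModel()  # in I2A this is pretrained, then frozen
        self.rollout_policy = nn.Linear(OBS_DIM, N_ACTIONS)  # cheap internal policy
        self.encoder = nn.LSTM(OBS_DIM + 1, HIDDEN, batch_first=True)
        self.model_free = nn.Sequential(nn.Linear(OBS_DIM, HIDDEN), nn.ReLU())
        joint = HIDDEN + N_ACTIONS * HIDDEN  # model-free features + one code per action
        self.policy_head = nn.Linear(joint, N_ACTIONS)  # actor
        self.value_head = nn.Linear(joint, 1)           # critic

    def imagine(self, obs, first_action):
        """Roll the learned model forward, starting from a fixed first action."""
        frames, action = [], first_action
        for _ in range(ROLLOUT_LEN):
            onehot = F.one_hot(action, N_ACTIONS).float()
            obs, reward = self.env_model(obs, onehot)
            frames.append(torch.cat([obs, reward], dim=-1))
            action = self.rollout_policy(obs).argmax(dim=-1)  # greedy continuation
        _, (h, _) = self.encoder(torch.stack(frames, dim=1))
        return h[-1]  # final hidden state summarises the imagined trajectory

    def forward(self, obs):
        # One imagined rollout per candidate first action, as in I2A.
        codes = [self.imagine(obs, torch.full((obs.shape[0],), a, dtype=torch.long))
                 for a in range(N_ACTIONS)]
        joint = torch.cat([self.model_free(obs)] + codes, dim=-1)
        return self.policy_head(joint), self.value_head(joint)

agent = I2AAgent()
logits, value = agent(torch.randn(2, OBS_DIM))
print(logits.shape, value.shape)  # torch.Size([2, 4]) torch.Size([2, 1])
```

The point the sketch tries to capture is the embedded planning loop: each candidate first action gets its own imagined trajectory under the learned model, and the encoded rollouts are concatenated with the model-free features before the A2C-style actor and critic heads.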

Notes

  1. https://gym.openai.com/.

References

  1. Schulman, J., Levine, S., Abbeel, P., Jordan, M., Moritz, P.: Trust region policy optimization. In: International Conference on Machine Learning, pp. 1889–1897 (2015)

  2. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms (2017). https://arxiv.org/abs/1707.06347

  3. Mnih, V., et al.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016)

  4. Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor (2018). https://arxiv.org/abs/1801.01290

  5. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.: Playing Atari with deep reinforcement learning (2013). https://arxiv.org/abs/1312.5602

  6. Lillicrap, T.P., et al.: Continuous control with deep reinforcement learning (2015). https://arxiv.org/abs/1509.02971

  7. Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550(7676), 354–359 (2017). https://doi.org/10.1038/nature24270

  8. Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T.: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362(6419), 1140–1144 (2018)

  9. Talvitie, E.: Model regularization for stable sample rollouts. In: Thirtieth Conference on Uncertainty in Artificial Intelligence, pp. 780–789 (2014)

  10. Talvitie, E.: Agnostic system identification for Monte Carlo planning. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)

  11. Racanière, S., et al.: Imagination-augmented agents for deep reinforcement learning. In: Advances in Neural Information Processing Systems, pp. 5690–5701 (2017)

  12. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)

  13. Lapan, M.: Deep Reinforcement Learning Hands-On: Apply Modern RL Methods, with Deep Q-networks, Value Iteration, Policy Gradients, TRPO, AlphaGo Zero and More. Packt Publishing, Birmingham (2018)

  14. Hafner, D., et al.: Learning latent dynamics for planning from pixels (2018). https://arxiv.org/abs/1811.04551

  15. Ha, D., Schmidhuber, J.: World models (2018). https://arxiv.org/abs/1803.10122

  16. Schrittwieser, J., et al.: Mastering Atari, Go, chess and shogi by planning with a learned model (2019). https://arxiv.org/abs/1911.08265

  17. Pal, C.V.: I2AGrid. Online source code (2020). https://github.com/ValentinPal/I2AGrid

Author information

Corresponding author

Correspondence to Florin Leon.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Pal, C.V., Leon, F. (2020). A Modified I2A Agent for Learning in a Stochastic Environment. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_30

  • DOI: https://doi.org/10.1007/978-3-030-63007-2_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63006-5

  • Online ISBN: 978-3-030-63007-2

  • eBook Packages: Computer Science, Computer Science (R0)
