
Case-Based Task Generalization in Model-Based Reinforcement Learning

  • Conference paper
Artificial General Intelligence (AGI 2021)

Abstract

Model-based reinforcement learning has recently demonstrated significant advances in solving complex sequential decision-making problems. By updating the model with the case of solving the current task, the agent can reuse that experience to solve subsequent similar tasks more efficiently. This approach also aligns with case-based planning, which already provides mechanisms for retrieving and reusing precedents. In this work, we propose a meta-learned case retrieval mechanism that supplies case-based samples to the agent to accelerate learning. We evaluated the proposed approach on the well-known MuJoCo benchmark and obtained results on par with methods that use pre-generated expert data.
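To make the idea of a case library concrete, the following is a minimal, hypothetical sketch of case retrieval: solved tasks are stored as (task embedding, trajectory) pairs, and the nearest stored cases are retrieved for a new task by distance in embedding space. This is an illustration only, not the paper's meta-learned mechanism; the class, names, and toy data are invented for this example.

```python
import numpy as np

class CaseLibrary:
    """Toy case store: each solved task is kept as a (task embedding,
    trajectory) pair; retrieval returns the k stored trajectories whose
    embeddings are closest (Euclidean distance) to the query embedding."""

    def __init__(self):
        self.embeddings = []    # 1-D task embeddings
        self.trajectories = []  # arbitrary trajectory payloads

    def add_case(self, embedding, trajectory):
        self.embeddings.append(np.asarray(embedding, dtype=float))
        self.trajectories.append(trajectory)

    def retrieve(self, query, k=1):
        """Return the k stored trajectories nearest to the query embedding."""
        if not self.embeddings:
            return []
        dists = np.linalg.norm(
            np.stack(self.embeddings) - np.asarray(query, dtype=float), axis=1
        )
        nearest = np.argsort(dists)[:k]
        return [self.trajectories[i] for i in nearest]

# Usage: store two solved tasks, then retrieve the closest case for a new task.
lib = CaseLibrary()
lib.add_case([0.0, 0.0], "walk-slow")
lib.add_case([1.0, 1.0], "walk-fast")
print(lib.retrieve([0.9, 1.1], k=1))  # -> ['walk-fast']
```

In the paper's setting the embedding and retrieval would themselves be learned; here a fixed nearest-neighbor lookup stands in for that component.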



Acknowledgements

This work was supported by the Russian Science Foundation (Project No. 20-71-10116).

Author information


Corresponding author

Correspondence to Artem Zholus.



Copyright information

© 2022 Springer Nature Switzerland AG

About this paper


Cite this paper

Zholus, A., Panov, A.I. (2022). Case-Based Task Generalization in Model-Based Reinforcement Learning. In: Goertzel, B., Iklé, M., Potapov, A. (eds.) Artificial General Intelligence. AGI 2021. Lecture Notes in Computer Science, vol. 13154. Springer, Cham. https://doi.org/10.1007/978-3-030-93758-4_35


  • DOI: https://doi.org/10.1007/978-3-030-93758-4_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93757-7

  • Online ISBN: 978-3-030-93758-4

  • eBook Packages: Computer Science; Computer Science (R0)
