Abstract
Model-based reinforcement learning has recently demonstrated significant advances in solving complex sequential decision-making problems. Updating the model with cases drawn from solving the current task allows the agent to reuse that model to solve subsequent similar tasks more efficiently. This approach also aligns with case-based planning, which already provides mechanisms for retrieving and reusing precedents. In this work, we propose a meta-learned case retrieval mechanism that supplies case-based samples to the agent to accelerate learning. We evaluated the proposed approach on the well-known MuJoCo benchmark and obtained results on par with methods that rely on pre-generated expert data.
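The retrieval mechanism proposed in the paper is meta-learned; as a point of reference, the underlying idea of case retrieval can be sketched with a simple non-learned baseline: store embeddings of previously solved tasks alongside their trajectories, and fetch the nearest stored cases for a new task. All class and variable names below are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

class CaseMemory:
    """Toy case memory: stores state embeddings with their associated
    trajectories (precedents) and retrieves the nearest cases for a
    query embedding via Euclidean distance."""

    def __init__(self):
        self.keys = []    # embeddings of stored cases
        self.cases = []   # trajectories associated with each embedding

    def add(self, embedding, trajectory):
        self.keys.append(np.asarray(embedding, dtype=float))
        self.cases.append(trajectory)

    def retrieve(self, query, k=1):
        # Rank stored cases by distance to the query and return the top k.
        query = np.asarray(query, dtype=float)
        dists = [np.linalg.norm(key - query) for key in self.keys]
        order = np.argsort(dists)[:k]
        return [self.cases[i] for i in order]

memory = CaseMemory()
memory.add([0.0, 0.0], "trajectory_A")
memory.add([1.0, 1.0], "trajectory_B")
print(memory.retrieve([0.1, -0.1], k=1))  # -> ['trajectory_A']
```

A meta-learned variant would replace the fixed Euclidean metric with a similarity function trained so that retrieved cases maximally accelerate learning on the new task.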
Acknowledgements
This work was supported by the Russian Science Foundation (Project No. 20-71-10116).
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Zholus, A., Panov, A.I. (2022). Case-Based Task Generalization in Model-Based Reinforcement Learning. In: Goertzel, B., Iklé, M., Potapov, A. (eds.) Artificial General Intelligence. AGI 2021. Lecture Notes in Computer Science, vol. 13154. Springer, Cham. https://doi.org/10.1007/978-3-030-93758-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93757-7
Online ISBN: 978-3-030-93758-4