Abstract
Model-based reinforcement learning has recently demonstrated significant advances in solving complex sequential decision-making problems. Updating the model with cases drawn from solving the current task allows the agent to reuse that model to solve subsequent similar tasks more efficiently. This approach also aligns with case-based planning, which already provides mechanisms for retrieving and reusing precedents. In this work, we propose a meta-learned case retrieval mechanism that supplies case-based samples to the agent to accelerate learning. We evaluated the proposed approach on the well-known MuJoCo benchmark and obtained results on par with methods that rely on pre-generated expert data.
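The retrieval mechanism proposed in the paper is meta-learned; as a point of reference, the underlying idea of case retrieval can be sketched with a simple non-learned baseline: store embeddings of previously solved tasks alongside their trajectories, and fetch the nearest stored cases for a new task. All class and variable names below are illustrative assumptions, not the paper's actual API.

```python
import numpy as np

class CaseMemory:
    """Toy case memory: stores state embeddings with their associated
    trajectories (precedents) and retrieves the nearest cases for a
    query embedding via Euclidean distance."""

    def __init__(self):
        self.keys = []    # embeddings of stored cases
        self.cases = []   # trajectories associated with each embedding

    def add(self, embedding, trajectory):
        self.keys.append(np.asarray(embedding, dtype=float))
        self.cases.append(trajectory)

    def retrieve(self, query, k=1):
        # Rank stored cases by distance to the query and return the top k.
        query = np.asarray(query, dtype=float)
        dists = [np.linalg.norm(key - query) for key in self.keys]
        order = np.argsort(dists)[:k]
        return [self.cases[i] for i in order]

memory = CaseMemory()
memory.add([0.0, 0.0], "trajectory_A")
memory.add([1.0, 1.0], "trajectory_B")
print(memory.retrieve([0.1, -0.1], k=1))  # -> ['trajectory_A']
```

A meta-learned variant would replace the fixed Euclidean metric with a similarity function trained so that retrieved cases maximally accelerate learning on the new task.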
Acknowledgements
This work was supported by the Russian Science Foundation (Project No. 20-71-10116).
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Zholus, A., Panov, A.I. (2022). Case-Based Task Generalization in Model-Based Reinforcement Learning. In: Goertzel, B., Iklé, M., Potapov, A. (eds.) Artificial General Intelligence. AGI 2021. Lecture Notes in Computer Science, vol. 13154. Springer, Cham. https://doi.org/10.1007/978-3-030-93758-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93757-7
Online ISBN: 978-3-030-93758-4