Transformed Successor Features for Transfer Reinforcement Learning

Garces, Kiyoshige; Xuan, Junyu; Zuo, Hua

doi:10.1007/978-981-99-8391-9_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14472)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

Reinforcement learning algorithms require a large number of samples to learn a specific task, and to achieve the same performance on a new task the agent must learn from scratch. Transfer reinforcement learning is an emerging solution that aims to improve sample efficiency by reusing previously learnt knowledge in new tasks. Successor features are a technique that reuses learnt representations to leverage that knowledge in unseen tasks. They have achieved outstanding results, but only under the assumption that the transition dynamics remain the same across tasks. The original successor feature approach therefore does not cover settings with different environment dynamics, which are common in real-life reinforcement learning problems. Our approach, transformed successor features, projects a set of diverse dynamics into a common dynamics distribution. Hence, it is an initial step towards relaxing the restriction that transfer is only possible across fixed environment dynamics. Experimental results indicate that transformed successor features improve the transfer of knowledge in environments with both fixed and diverse dynamics, evaluated on the control of a simulated robotic arm, a robotic leg, and the cartpole environment.
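
To make the fixed-dynamics assumption concrete, the following is a minimal sketch of the general successor feature idea, not of the transformed method proposed in this chapter. It assumes a hypothetical three-state chain under a fixed policy, one-hot state features, and rewards that are linear in those features; all names (`successor_features`, `values`, `P`, `P_shifted`, `w_task_a`, `w_task_b`) are illustrative only.

```python
def successor_features(P, Phi, gamma, n_iters=2000):
    """Iterate psi = Phi + gamma * P @ psi for a fixed policy.

    P   : S x S transition matrix under the policy (the task dynamics)
    Phi : S x d matrix of state features
    Returns the S x d matrix of expected discounted feature sums.
    """
    n_states, n_feats = len(Phi), len(Phi[0])
    psi = [[0.0] * n_feats for _ in range(n_states)]
    for _ in range(n_iters):
        psi = [
            [Phi[s][k] + gamma * sum(P[s][t] * psi[t][k] for t in range(n_states))
             for k in range(n_feats)]
            for s in range(n_states)
        ]
    return psi


def values(psi, w):
    """State values for a task whose reward is r(s) = Phi[s] . w."""
    return [sum(p * wk for p, wk in zip(row, w)) for row in psi]


# Hypothetical dynamics and features (assumptions for illustration).
P = [[0.9, 0.1, 0.0],
     [0.0, 0.9, 0.1],
     [0.1, 0.0, 0.9]]
Phi = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
gamma = 0.95

psi = successor_features(P, Phi, gamma)

# Two tasks that share the dynamics but differ in reward weights:
# the same psi is reused, only the dot product changes.
w_task_a = [1.0, 0.0, 0.0]
w_task_b = [0.0, 0.0, 1.0]
v_task_a = values(psi, w_task_a)
v_task_b = values(psi, w_task_b)

# If the dynamics change, psi computed under P no longer applies and
# has to be recomputed, which is the restriction the abstract refers to.
P_shifted = [[0.5, 0.5, 0.0],
             [0.0, 0.5, 0.5],
             [0.5, 0.0, 0.5]]
psi_shifted = successor_features(P_shifted, Phi, gamma)
v_task_b_shifted = values(psi_shifted, w_task_b)
```

In this sketch, switching from task A to task B costs only a dot product because the dynamics are shared, whereas switching from `P` to `P_shifted` invalidates the stored successor features.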

Acknowledgments

This work was supported by the Australian Research Council through Discovery Early Career Researcher Awards DE220101075 and DE200100245, and by the University of Technology Sydney (UTS) and Australian Technology Network (ATN) through UTS ATN-LATAM Research Scholarship Award.

Author information

Authors and Affiliations

Australian Artificial Intelligence Institute (AAII), University of Technology Sydney, 15 Broadway, Sydney, NSW, 2007, Australia
Kiyoshige Garces, Junyu Xuan & Hua Zuo

Corresponding author

Correspondence to Kiyoshige Garces.

Editor information

Editors and Affiliations

The University of Sydney, Darlington, NSW, Australia
Tongliang Liu
Monash University, Clayton, VIC, Australia
Geoff Webb
The University of Newcastle, Callaghan, NSW, Australia
Lin Yue
CSIRO Data61, Sydney, NSW, Australia
Dadong Wang

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1097 KB)

About this paper

Cite this paper

Garces, K., Xuan, J., Zuo, H. (2024). Transformed Successor Features for Transfer Reinforcement Learning. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science, vol 14472. Springer, Singapore. https://doi.org/10.1007/978-981-99-8391-9_24

DOI: https://doi.org/10.1007/978-981-99-8391-9_24
Published: 27 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8390-2
Online ISBN: 978-981-99-8391-9
eBook Packages: Computer Science, Computer Science (R0)