
Unsupervised State Representation Learning in Partially Observable Atari Games

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2023)

Abstract

State representation learning aims to capture the latent factors of an environment. Although some researchers have recognized the connection between masked image modeling and contrastive representation learning, those efforts focus on using masks as an augmentation technique to better represent the latent generative factors. Partially observable environments in reinforcement learning have not yet been studied carefully with unsupervised state representation learning methods.

In this article, we create an unsupervised state representation learning scheme for partially observable states. We conduct our experiments on an existing Atari 2600 framework designed to evaluate representation learning models. A contrastive method called Spatiotemporal DeepInfomax (ST-DIM) has shown state-of-the-art performance on this benchmark but remains inferior to its supervised counterpart. Our approach improves on ST-DIM when the environment is not fully observable and achieves higher F1 and accuracy scores than the supervised counterpart. Averaged over categories, the mean accuracy score of our approach is ~66%, compared to ~38% for supervised learning, and the mean F1 score is ~64%, compared to ~33%. The code can be found at https://github.com/mengli11235/MST_DIM.
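To make the masking-as-augmentation idea concrete, below is a minimal PyTorch sketch of an ST-DIM-style global-local InfoNCE objective in which random patches of the anchor frame are zeroed out before encoding. This is an illustrative reconstruction under stated assumptions, not the released implementation (see the repository above for that): the encoder architecture, patch size, masking ratio, and the helper names random_patch_mask and global_local_infonce are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_patch_mask(x, patch=8, ratio=0.5):
    """Zero out a random fraction of patch-aligned regions per image
    (hypothetical masking scheme; the paper's exact scheme may differ)."""
    b, _, h, w = x.shape
    gh, gw = h // patch, w // patch
    keep = (torch.rand(b, 1, gh, gw, device=x.device) > ratio).float()
    mask = F.interpolate(keep, size=(h, w), mode="nearest")
    return x * mask

class Encoder(nn.Module):
    """Small conv encoder returning local feature maps and a global vector."""
    def __init__(self, in_ch=1, dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, dim, 4, stride=2), nn.ReLU(),
        )
        self.head = nn.LazyLinear(dim)

    def forward(self, x):
        fmap = self.conv(x)                # (B, dim, H', W') local features
        glob = self.head(fmap.flatten(1))  # (B, dim) global feature
        return fmap, glob

def global_local_infonce(glob_t, fmap_tp1, W):
    """InfoNCE between global features at t and local features at t+1.
    The matching batch element at each spatial location is the positive;
    the other batch elements at that location act as negatives."""
    b, d, h, w = fmap_tp1.shape
    locals_tp1 = fmap_tp1.permute(2, 3, 0, 1).reshape(h * w, b, d)
    proj = glob_t @ W                                      # bilinear scoring
    logits = torch.einsum("bd,lkd->lbk", proj, locals_tp1) # (HW, B, B)
    target = torch.arange(b, device=glob_t.device).expand(h * w, b)
    return F.cross_entropy(logits.reshape(-1, b), target.reshape(-1))

# Toy usage: consecutive frames from a batch of trajectories.
enc = Encoder()
W = nn.Parameter(torch.randn(64, 64) * 0.02)
x_t = torch.rand(16, 1, 84, 84)       # observations at time t
x_tp1 = torch.rand(16, 1, 84, 84)     # observations at time t+1
_, g_t = enc(random_patch_mask(x_t))  # mask only the anchor frame
f_tp1, _ = enc(x_tp1)
loss = global_local_infonce(g_t, f_tp1, W)
loss.backward()
```

The objective follows the published description of ST-DIM's global-local term: the positive pair is the global embedding of frame t and each local feature of frame t+1 from the same trajectory, with the corresponding local features of other batch elements serving as negatives. Masking only the anchor frame is one plausible way to mimic partial observability; how the masks are applied in the actual method is an assumption here.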



Acknowledgements

This work was performed on the [ML node] resource, owned by the University of Oslo, and operated by the Department for Research Computing at USIT, the University of Oslo IT-department. http://www.hpc.uio.no/.

Author information


Corresponding author

Correspondence to Li Meng.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Meng, L., Goodwin, M., Yazidi, A., Engelstad, P. (2023). Unsupervised State Representation Learning in Partially Observable Atari Games. In: Tsapatsoulis, N., et al. Computer Analysis of Images and Patterns. CAIP 2023. Lecture Notes in Computer Science, vol 14185. Springer, Cham. https://doi.org/10.1007/978-3-031-44240-7_21


  • DOI: https://doi.org/10.1007/978-3-031-44240-7_21


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44239-1

  • Online ISBN: 978-3-031-44240-7

  • eBook Packages: Computer Science, Computer Science (R0)
