
Boosting Reinforcement Learning with Unsupervised Feature Extraction

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation (ICANN 2019)

Abstract

Learning to process visual input for deep reinforcement learning is challenging: training a neural network with nothing but a sparse and delayed reward signal is a poor fit for the feature extraction stage. In this work, Deep Q-Networks are augmented with several unsupervised machine learning methods that provide additional training signal for the feature extraction stage, helping it find a well-suited representation of the input data. We investigate, in an end-to-end architecture, the influence of convolutional filters pretrained on a supervised classification task, of a Convolutional Autoencoder, and of Slow Feature Analysis. Experiments are performed on five ViZDoom environments. We find that the unsupervised methods boost Deep Q-Networks significantly, depending on the task the agent has to fulfill: pretrained filters improve object detection tasks, while Convolutional Autoencoders help with navigation and orientation tasks. Combining these two approaches yields an agent that performs well on all tested environments.
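Of the three unsupervised methods the abstract names, Slow Feature Analysis is perhaps the least widely known. The following numpy sketch implements plain linear SFA, not the convolutional, gradient-based variant used in the paper; the toy signal and the mixing matrix are illustrative assumptions, chosen only to show that SFA recovers a slowly varying source from a fast-varying mixture:

```python
import numpy as np

def linear_sfa(X, n_components=1):
    """Linear Slow Feature Analysis (Wiskott & Sejnowski, 2002).

    Given a multivariate time series X of shape (T, D), find linear
    projections whose outputs vary as slowly as possible over time,
    subject to zero-mean, unit-variance, and decorrelation constraints.
    """
    # 1. Center the signal.
    Xc = X - X.mean(axis=0)
    # 2. Whiten so that the covariance becomes the identity matrix.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    W_white = eigvecs / np.sqrt(eigvals)
    Z = Xc @ W_white
    # 3. In whitened space, minimizing the variance of the temporal
    #    difference signal is an eigenvalue problem; the eigenvectors
    #    with the SMALLEST eigenvalues give the slowest features.
    dZ = np.diff(Z, axis=0)
    _, d_vecs = np.linalg.eigh(np.cov(dZ, rowvar=False))
    W_slow = d_vecs[:, :n_components]
    return Z @ W_slow, W_white @ W_slow

# Toy demo: a slow sine mixed with a fast one; SFA isolates the slow source.
t = np.arange(500)
slow = np.sin(2 * np.pi * t / 500)   # one period over the whole series
fast = np.sin(2 * np.pi * t / 20)    # 25 periods over the same window
X = np.column_stack([slow, fast]) @ np.array([[1.0, 0.6], [0.4, 1.0]])
y, _ = linear_sfa(X, n_components=1)
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
```

In a reinforcement learning pipeline, features of this kind would complement or replace the representation that the Q-network otherwise has to learn from the reward signal alone; the intuition is that quantities that change slowly over time (such as an agent's position or orientation) tend to be behaviorally relevant.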



Author information


Corresponding author

Correspondence to Simon Hakenes.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Hakenes, S., Glasmachers, T. (2019). Boosting Reinforcement Learning with Unsupervised Feature Extraction. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Theoretical Neural Computation. ICANN 2019. Lecture Notes in Computer Science, vol 11727. Springer, Cham. https://doi.org/10.1007/978-3-030-30487-4_43

  • DOI: https://doi.org/10.1007/978-3-030-30487-4_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30486-7

  • Online ISBN: 978-3-030-30487-4

  • eBook Packages: Computer Science, Computer Science (R0)
