
Deep Reinforcement Learning with Temporal-Awareness Network

  • Conference paper
  • Neural Information Processing (ICONIP 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12533)


Abstract

Advances in deep reinforcement learning have enabled autonomous agents to perform well on video games, often outperforming humans, using only raw pixels to make decisions. However, temporal context is not yet fully integrated into such agents. In this paper, we extend the Deep Q-network (DQN) with a spatio-temporal architecture, a novel framework that addresses this temporal limitation. To incorporate spatio-temporal information, we construct several architectural variants that feed spatial and temporal representations into the Deep Q-network in different ways: DQN with a convolutional neural network (DQN-Conv), DQN with an LSTM recurrent neural network (DQN-LSTM), DQN with a 3D convolutional neural network (DQN-3DConv), and DQN with spatial and temporal fusion (DQN-Fusion). These variants allow us to explore the mutual yet still fuzzy relationship between spatial and temporal information. Extensive experiments on the popular mobile game Flappy Bird show that our framework achieves superior results compared with baseline models.
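A common way to give a convolutional Q-network short-term temporal context, and the baseline that spatio-temporal variants like those above extend, is to stack the k most recent preprocessed frames into a single network input. The sketch below illustrates that idea in NumPy; the function name, the 84×84 frame size, and k = 4 are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from collections import deque

def stacked_input(frames, k=4, frame_shape=(84, 84)):
    """Build a (k, H, W) network input from the most recent frames.

    If fewer than k frames have been observed (e.g. at episode start),
    the stack is left-padded with zero frames so the shape is constant.
    """
    frames = list(frames)[-k:]
    while len(frames) < k:
        frames.insert(0, np.zeros(frame_shape, dtype=np.float32))
    return np.stack(frames, axis=0)

# Usage: keep a rolling buffer of preprocessed frames during play.
stack = deque(maxlen=4)
stack.append(np.ones((84, 84), dtype=np.float32))  # first observed frame
x = stacked_input(stack)
print(x.shape)  # (4, 84, 84)
```

A recurrent or 3D-convolutional variant would instead consume the frame sequence through an LSTM or a 3D convolution rather than collapsing time into input channels, which is the design axis the architectures above vary.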



Acknowledgement

This work was supported by the National Key R&D Program of China (No. 2017YFE0117500).

Author information

Corresponding author: Jian-wei Liu.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, Z.-y., Liu, J.-w., Li, W., Zuo, X. (2020). Deep Reinforcement Learning with Temporal-Awareness Network. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds.) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol. 12533. Springer, Cham. https://doi.org/10.1007/978-3-030-63833-7_24

  • DOI: https://doi.org/10.1007/978-3-030-63833-7_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63832-0

  • Online ISBN: 978-3-030-63833-7

  • eBook Packages: Computer Science, Computer Science (R0)
