Abstract
Advances in deep reinforcement learning have enabled autonomous agents to perform well on video games, often outperforming humans, using only raw pixels as input. However, timely context awareness is not fully integrated into these agents. In this paper, we extend the Deep Q-network (DQN) with a spatio-temporal architecture, a novel framework that addresses this temporal limitation. To incorporate spatio-temporal information, we construct several architecture variants that feed spatial and temporal representations into the Deep Q-network in different ways: DQN with a convolutional neural network (DQN-Conv), DQN with an LSTM recurrent neural network (DQN-LSTM), DQN with a 3D convolutional neural network (DQN-3DConv), and DQN with spatial and temporal fusion (DQN-Fusion). These variants allow us to explore the interplay between spatial and temporal information. Extensive experiments on the popular mobile game Flappy Bird show that our framework achieves superior results compared with baseline models.
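All of the DQN variants described above are trained with the standard Q-learning bootstrap target. As a minimal sketch of that shared training step (the function name, batch layout, and discount factor below are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Standard DQN training targets: y = r + gamma * max_a' Q(s', a'),
    with the bootstrap term zeroed out on terminal transitions."""
    max_next_q = next_q_values.max(axis=1)          # best action value in s'
    return rewards + gamma * max_next_q * (1.0 - dones)

# Toy batch of two transitions (two actions per state).
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0],    # max = 2.0
                   [1.0, 0.2]])   # terminal: bootstrap term is ignored
dones = np.array([0.0, 1.0])
print(dqn_targets(rewards, next_q, dones))  # [2.98, 0.0]
```

The variants differ only in how the Q-network that produces `next_q_values` consumes its input: a single frame (DQN-Conv), a recurrent state over frames (DQN-LSTM), a stacked frame volume (DQN-3DConv), or a fused spatial-temporal representation (DQN-Fusion).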
Acknowledgement
This work was supported by the National Key R&D Program of China (No. 2017YFE0117500).
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Liu, Zy., Liu, Jw., Li, W., Zuo, X. (2020). Deep Reinforcement Learning with Temporal-Awareness Network. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol. 12533. Springer, Cham. https://doi.org/10.1007/978-3-030-63833-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63832-0
Online ISBN: 978-3-030-63833-7