Abstract
Attention-based agents have achieved considerable success in many areas of Artificial Intelligence, such as Deep Reinforcement Learning. This work revisits two such architectures, namely Deep Attention Recurrent Q-Networks (DARQN) and Soft Top-Down Spatial Attention (STDA), and explores the similarities between them. More specifically, it aims to improve the performance of the DARQN architecture by leveraging elements proposed in STDA, such as the formulation of its attention function, which incorporates a spatial basis into the computation. The implementation tested, denoted Deep Attention Recurrent Actor-Critic (DARAC), uses the A2C learning algorithm. The results obtained suggest that the performance of DARAC can be improved by incorporating some of the techniques proposed in STDA. Overall, DARAC achieved results competitive with STDA, and slightly better in some of the experiments performed. The Atari 2600 videogame benchmark was the testbed used to perform and validate all the experiments.
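For concreteness, below is a minimal sketch (in PyTorch, not the authors' code) of the STDA-style soft spatial attention step that DARAC borrows from Mott et al. (2019): a fixed spatial basis is concatenated to the CNN feature map so that each location carries both 'what' and 'where' information, and a top-down query produces softmax weights over spatial locations. All names, shapes, and the Fourier form of the basis are assumptions for illustration; STDA additionally derives separate key and value tensors via 1×1 convolutions, which is omitted here for brevity.

```python
# Minimal sketch of an STDA-style soft spatial attention step.
# Shapes, names, and the Fourier basis form are illustrative assumptions.
import torch
import torch.nn.functional as F

def spatial_basis(h, w, channels):
    # Fixed 2-D Fourier-like spatial basis: one map per (u, v) frequency
    # pair; `channels` is assumed to be a perfect square.
    y = torch.linspace(0, torch.pi, h).view(h, 1)
    x = torch.linspace(0, torch.pi, w).view(1, w)
    freqs = int(channels ** 0.5)
    maps = [torch.cos(u * y) * torch.cos(v * x)
            for u in range(1, freqs + 1) for v in range(1, freqs + 1)]
    return torch.stack(maps, dim=-1)  # (h, w, channels)

def attention_step(features, query, basis):
    # features: (B, H, W, C) CNN output; query: (B, C + S) top-down query
    # basis:    (H, W, S) fixed spatial basis, broadcast over the batch
    b, h, w, c = features.shape
    basis = basis.unsqueeze(0).expand(b, -1, -1, -1)
    keys = torch.cat([features, basis], dim=-1)          # 'what' + 'where'
    logits = torch.einsum('bhwc,bc->bhw', keys, query)   # dot-product scores
    attn = F.softmax(logits.view(b, -1), dim=-1).view(b, h, w)
    # Attention-weighted spatial summary fed back to the recurrent core.
    answer = torch.einsum('bhw,bhwc->bc', attn, keys)
    return answer, attn

# Example: batch of 4, an 8x8 feature map with 32 channels, 16 basis maps.
feats = torch.randn(4, 8, 8, 32)
q = torch.randn(4, 32 + 16)
basis = spatial_basis(8, 8, 16)
answer, attn = attention_step(feats, q, basis)  # (4, 48), (4, 8, 8)
```

In a DARQN/DARAC-style loop, the attended summary would be fed to the LSTM core, whose updated state in turn generates the query for the next timestep.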
Notes
- 1. In the limit, at each timestep t, this memory module could potentially integrate information from the k ≤ t−1 past observations, effectively compressing the whole history.
- 2. The 'where' and 'what' driving the decision process.
- 3.
Acknowledgements
This research was funded by Fundação para a Ciência e a Tecnologia, grant numbers SFRH/BD/145723/2019 and UID/CEC/00127/2019.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Duarte, F.F., Lau, N., Pereira, A., Reis, L.P. (2023). Revisiting Deep Attention Recurrent Networks. In: Moniz, N., Vale, Z., Cascalho, J., Silva, C., Sebastião, R. (eds) Progress in Artificial Intelligence. EPIA 2023. Lecture Notes in Computer Science, vol 14115. Springer, Cham. https://doi.org/10.1007/978-3-031-49008-8_10