Abstract
Reinforcement learning (RL) has recently been applied to planning and decision-making in autonomous driving. However, most existing approaches use model-free RL techniques, which require a large volume of interactions with the environment to achieve satisfactory performance. In this work, we introduce, for the first time, an environment model built solely on self-attention into model-based RL for autonomous driving in dense traffic, where intensive interactions among traffic participants may occur. First, an environment model based solely on the self-attention mechanism is proposed to simulate the dynamic transitions of dense traffic. Two attention modules are introduced: the Horizon Attention (HA) module and the Frame Attention (FA) module. The proposed environment model shows superior prediction performance compared with other state-of-the-art (SOTA) prediction methods in dense traffic environments. The environment model is then employed to develop several model-based RL algorithms. Experiments show that our proposed model-based RL algorithms achieve higher sample efficiency than model-free methods. Moreover, the agents trained with our algorithms all outperform their model-free counterparts in success rate and passing time.
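The abstract names the two attention modules but gives no architectural detail. The following is a minimal PyTorch sketch of how a Frame Attention / Horizon Attention decomposition could be wired: FA attends across vehicles within one traffic frame, HA attends across the temporal horizon for each vehicle, and a linear head predicts the next frame. All layer sizes, the per-vehicle feature layout, and the prediction head are illustrative assumptions, not the paper's actual design.

    import torch
    import torch.nn as nn

    class AttentionEnvModel(nn.Module):
        """Hypothetical self-attention environment model (FA + HA sketch)."""

        def __init__(self, feat_dim=7, embed_dim=64, n_heads=4):
            super().__init__()
            self.embed = nn.Linear(feat_dim, embed_dim)  # per-vehicle embedding
            self.frame_attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
            self.horizon_attn = nn.MultiheadAttention(embed_dim, n_heads, batch_first=True)
            self.head = nn.Linear(embed_dim, feat_dim)   # next-frame feature prediction

        def forward(self, x):
            # x: (batch, horizon, n_vehicles, feat_dim) -- a short history of frames
            b, t, n, _ = x.shape
            h = self.embed(x)                            # (b, t, n, e)

            # Frame Attention: interactions among vehicles within each frame
            h = h.reshape(b * t, n, -1)
            h, _ = self.frame_attn(h, h, h)
            h = h.reshape(b, t, n, -1)

            # Horizon Attention: temporal dependencies across the history, per vehicle
            h = h.permute(0, 2, 1, 3).reshape(b * n, t, -1)
            h, _ = self.horizon_attn(h, h, h)

            # Predict each vehicle's next-frame features from the last time step
            h = h[:, -1].reshape(b, n, -1)
            return self.head(h)                          # (b, n_vehicles, feat_dim)

    model = AttentionEnvModel()
    history = torch.randn(8, 4, 5, 7)  # 8 rollouts, 4 past frames, 5 vehicles, 7 features
    next_frame = model(history)        # predicted per-vehicle features for the next frame

In a Dyna-style model-based RL loop, a learned transition model of this kind can generate imagined rollouts that supplement real environment interactions, which is the usual source of the sample-efficiency gains the abstract reports.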