Abstract
With improvements in AI technology and sensor performance, research on automated driving has become increasingly active. However, most studies assume vehicles that drive in a human-like style. In this study, we consider an environment in which only autonomous vehicles are present. In such an environment, it is essential to develop control methods that actively exploit the characteristics of autonomous vehicles, such as dense information exchange and highly accurate vehicle control. To address this issue, we investigated the emergence of automated driving rules through reinforcement learning based on information obtained from surrounding vehicles via inter-vehicle communication. We evaluated whether reinforcement learning converges when distance-sensor information can be shared in real time through vehicle-to-vehicle communication, and whether it can learn a rational driving method. The simulation results show a positive trend in the cumulative reward, indicating that the proposed multi-agent learning method with an extended own-vehicle environment has the potential to automatically learn vehicle control with cooperative behavior. Furthermore, we analyzed whether a rational driving method (action selection) can be acquired through reinforcement learning. The simulation results showed that reinforcement learning achieves rational control of overtaking behavior.
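The "extended own-vehicle environment" described above can be illustrated with a minimal sketch: each agent's observation vector concatenates its own distance-sensor readings with readings shared by nearby vehicles over V2V communication, zero-padded to a fixed length so it can feed a standard RL policy network. The function name, sensor layout, and neighbor limit below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def build_extended_observation(own_sensors, neighbor_sensors, max_neighbors=3):
    """Sketch of an extended own-vehicle observation (assumed layout).

    own_sensors      : ego vehicle's distance-sensor readings.
    neighbor_sensors : list of sensor vectors received via V2V, one per
                       nearby vehicle (may be shorter than max_neighbors).
    Missing neighbor slots are zero-padded so the observation length is
    fixed, as most RL policy networks require a constant input size.
    """
    n = len(own_sensors)
    parts = [np.asarray(own_sensors, dtype=np.float32)]
    for i in range(max_neighbors):
        if i < len(neighbor_sensors):
            parts.append(np.asarray(neighbor_sensors[i], dtype=np.float32))
        else:
            parts.append(np.zeros(n, dtype=np.float32))  # pad absent neighbor
    return np.concatenate(parts)

# Example: 3 ego sensor readings, one neighbor sharing its readings.
obs = build_extended_observation([0.8, 0.5, 0.9], [[0.4, 0.7, 0.6]])
```

With `max_neighbors=3` and three sensors per vehicle, the observation is a fixed 12-element vector regardless of how many neighbors are currently in communication range.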
This work was presented in part at the joint symposium of the 27th International Symposium on Artificial Life and Robotics, the 7th International Symposium on BioComplexity, and the 5th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Online, January 25–27, 2022).
Harada, T., Matsuoka, J. & Hattori, K. Behavior analysis of emergent rule discovery for cooperative automated driving using deep reinforcement learning. Artif Life Robotics 28, 31–42 (2023). https://doi.org/10.1007/s10015-022-00839-7