
An efficient planning method based on deep reinforcement learning with hybrid actions for autonomous driving on highway

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Due to the complexity and uncertainty of traffic, planning for autonomous driving (AD) on highways is challenging. Traditional planning algorithms suffer from low and unstable efficiency, which degrades the real-time performance of the autonomous vehicle (AV). Deep reinforcement learning (DRL) is an emerging and promising method that has achieved impressive performance in many fields. In this paper, we propose a novel planning approach based on soft actor-critic (SAC) with hybrid actions. The algorithm takes structured information about the ego vehicle and its surroundings as input and generates a terminal state for the ego vehicle in the Frenet frame; a polynomial planner then outputs a feasible, continuous spatiotemporal trajectory based on this intermediate state. Unlike sampling-based planning methods, only a single polynomial planning pass is required, which improves planning efficiency significantly. Experiments show that the DRL agent with hybrid actions is safer than agents with only continuous or only discrete actions. Compared with other planning methods, the proposed algorithm achieves the shortest and most stable planning time across different scenarios.
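To make the described pipeline concrete, the sketch below shows one way a hybrid action (a discrete lane choice plus continuous terminal speed and time horizon) could be turned into a single Frenet-frame trajectory by solving quintic-polynomial boundary-value problems. This is a minimal illustration under assumed interfaces: the lane width, the terminal-state parameterization, and the function names (quintic_coeffs, plan_trajectory) are our own assumptions rather than the authors' exact formulation, and the SAC policy that produces the hybrid action is omitted.

# Minimal sketch (not the authors' exact implementation): a hybrid action,
# i.e. a discrete lane choice plus continuous terminal speed and horizon,
# is converted into a single Frenet-frame trajectory by solving two
# quintic-polynomial boundary-value problems. Names and the lane-width
# value are illustrative assumptions.
import numpy as np

LANE_WIDTH = 3.75  # assumed lane width [m]

def quintic_coeffs(x0, v0, a0, x1, v1, a1, T):
    """Coefficients of the quintic x(t) joining (x0, v0, a0) at t=0
    to (x1, v1, a1) at t=T."""
    A = np.array([
        [T**3,    T**4,     T**5],
        [3*T**2,  4*T**3,   5*T**4],
        [6*T,    12*T**2,  20*T**3],
    ])
    b = np.array([
        x1 - (x0 + v0*T + 0.5*a0*T**2),
        v1 - (v0 + a0*T),
        a1 - a0,
    ])
    c3, c4, c5 = np.linalg.solve(A, b)
    return np.array([x0, v0, 0.5*a0, c3, c4, c5])

def plan_trajectory(ego_frenet, action, dt=0.1):
    """One-shot polynomial planning from a hybrid action.

    ego_frenet: (s, s_dot, s_ddot, d, d_dot, d_ddot), with d measured
                from the current lane centerline.
    action:     (lane_offset, target_speed, T) - discrete relative lane
                choice (-1 left, 0 keep, +1 right) plus continuous
                terminal speed [m/s] and planning horizon [s].
    """
    s, sd, sdd, d, dd, ddd = ego_frenet
    lane_offset, target_speed, T = action

    # Terminal (intermediate) state implied by the hybrid action.
    d_target = lane_offset * LANE_WIDTH           # center of the chosen lane
    s_target = s + 0.5 * (sd + target_speed) * T  # rough terminal arc length

    lon = quintic_coeffs(s, sd, sdd, s_target, target_speed, 0.0, T)
    lat = quintic_coeffs(d, dd, ddd, d_target, 0.0, 0.0, T)

    t = np.arange(0.0, T + dt, dt)
    powers = np.vstack([t**i for i in range(6)])  # shape (6, len(t))
    return powers.T @ lon, powers.T @ lat         # s(t), d(t)

# Example: change one lane to the left and reach 25 m/s within 4 s.
s_traj, d_traj = plan_trajectory((0.0, 20.0, 0.0, 0.0, 0.0, 0.0),
                                 (-1, 25.0, 4.0))

In the full method, the SAC actor would output the discrete and continuous action components jointly, and the resulting trajectory would still need to be checked for dynamic feasibility and collision before execution.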





Author information


Corresponding author

Correspondence to Jinhui Zhu.

Ethics declarations

Conflict of interest

We declare that there is no conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, M., Chen, K. & Zhu, J. An efficient planning method based on deep reinforcement learning with hybrid actions for autonomous driving on highway. Int. J. Mach. Learn. & Cyber. 14, 3483–3499 (2023). https://doi.org/10.1007/s13042-023-01845-2



  • DOI: https://doi.org/10.1007/s13042-023-01845-2
