Abstract
This paper proposes a novel supervised Actor–Critic (SAC) approach to the adaptive cruise control (ACC) problem. The two key elements of the SAC algorithm, the Actor and the Critic, are each approximated by a feed-forward neural network. The Actor's output and the state are fed to the Critic, which approximates the performance index function. A Lyapunov stability analysis proves that the estimation errors of the neural networks are uniformly ultimately bounded. Moreover, a supervisory controller pre-trains the Actor to obtain a basic control policy, which improves training convergence and the success rate. We apply this method to learn an approximately optimal control policy for the ACC problem. Experimental results in several driving scenarios demonstrate that the SAC algorithm performs well and is feasible and effective for the ACC problem.
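The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional state (a distance error and a relative speed), the clipped linear `supervisor` law and its gains, the `TinyMLP` class, the network sizes, and the learning rate are all assumptions chosen for illustration. The sketch covers only the supervised pre-training of the Actor and a forward pass through the Critic; the paper's Critic and Actor update rules and the Lyapunov analysis are not reproduced.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer feed-forward network: tanh hidden units, linear output."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Cache the input and hidden activations for the backward pass.
        self.x = list(x)
        self.h = [math.tanh(sum(w * xi for w, xi in zip(row, self.x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, self.h)) + self.b2

    def sgd_step(self, x, target, lr):
        """One gradient step on 0.5 * (output - target)^2."""
        err = self.forward(x) - target
        for j, hj in enumerate(self.h):
            # Backpropagate through tanh using the pre-update output weight.
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_pre
            for i, xi in enumerate(self.x):
                self.w1[j][i] -= lr * grad_pre * xi
        self.b2 -= lr * err
        return err * err


def supervisor(state):
    """Hypothetical supervisory controller (assumed clipped linear feedback)."""
    d_err, v_rel = state
    return max(-1.0, min(1.0, 0.5 * d_err + 0.8 * v_rel))


def mean_sq_error(actor, states):
    """Mean squared gap between the Actor's action and the supervisor's."""
    return sum((actor.forward(s) - supervisor(s)) ** 2 for s in states) / len(states)


rng = random.Random(0)
actor = TinyMLP(n_in=2, n_hidden=8, rng=rng)   # state -> action
critic = TinyMLP(n_in=3, n_hidden=8, rng=rng)  # (state, action) -> index estimate

# Supervised pre-training: regress the Actor onto the supervisor's actions.
states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
mse_before = mean_sq_error(actor, states)
for _ in range(50):
    for s in states:
        actor.sgd_step(s, supervisor(s), lr=0.05)
mse_after = mean_sq_error(actor, states)

# The Critic takes the state together with the Actor's output as its input.
s = states[0]
a = actor.forward(s)
j_hat = critic.forward(list(s) + [a])  # untrained estimate, shown for wiring only
```

Comparing `mse_before` with `mse_after` shows how closely the pre-trained Actor imitates the assumed supervisor. In a full SAC loop the Critic would then be trained from the observed costs and would in turn refine the Actor beyond this basic policy.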
Acknowledgments
We acknowledge Dr. Cesare Alippi and Dr. Yuzhu Huang for their valuable discussions.
Additional information
Communicated by C. Alippi, D. Zhao and D. Liu.
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61273136 and 61034002, and by the Beijing Natural Science Foundation under Grant No. 4122083.
About this article
Cite this article
Zhao, D., Wang, B. & Liu, D. A supervised Actor–Critic approach for adaptive cruise control. Soft Comput 17, 2089–2099 (2013). https://doi.org/10.1007/s00500-013-1110-y
DOI: https://doi.org/10.1007/s00500-013-1110-y