A supervised Actor–Critic approach for adaptive cruise control

  • Methodologies and Application
Soft Computing

Abstract

This paper proposes a novel supervised Actor–Critic (SAC) approach for the adaptive cruise control (ACC) problem. The two key elements of the SAC algorithm, the Actor and the Critic, are each approximated by a feed-forward neural network. The Actor's output and the state are fed to the Critic, which approximates the performance index function. A Lyapunov stability analysis proves that the estimation errors of the neural networks are uniformly ultimately bounded. Moreover, a supervisory controller is used to pre-train the Actor so that it starts from a basic control policy, which improves the convergence and success rate of training. We apply this method to learn an approximately optimal control policy for the ACC problem. Experimental results in several driving scenarios demonstrate that the SAC algorithm performs well, indicating that it is feasible and effective for the ACC problem.
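The structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-dimensional state (taken here to be a distance error and a relative speed), the linear `supervisor` law and its gains, the network sizes, and the learning rate are all assumptions chosen for illustration. The sketch shows only two points from the abstract: the Actor is pre-trained to imitate a supervisory controller, and the Critic takes the state together with the Actor's output as its input. The reinforcement learning updates and the stability analysis are not covered.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    """One-hidden-layer feed-forward network with small random weights."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(net, x):
    """Return the network output and the hidden-layer activations."""
    h = np.tanh(x @ net["W1"] + net["b1"])
    return h @ net["W2"] + net["b2"], h

def supervisor(states):
    """Hypothetical supervisory controller: linear feedback with assumed gains."""
    k_d, k_v = 0.5, 1.0
    return k_d * states[:, :1] + k_v * states[:, 1:2]

def mse(net, states, targets):
    """Mean squared error between the network output and the targets."""
    out, _ = forward(net, states)
    return float(np.mean((out - targets) ** 2))

def pretrain_actor(actor, states, targets, lr=0.05, epochs=2000):
    """Supervised pre-training: full-batch gradient descent on 0.5 * MSE."""
    n = len(states)
    for _ in range(epochs):
        out, h = forward(actor, states)
        err = out - targets
        dh = (err @ actor["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
        actor["W2"] -= lr * (h.T @ err) / n
        actor["b2"] -= lr * err.mean(axis=0)
        actor["W1"] -= lr * (states.T @ dh) / n
        actor["b1"] -= lr * dh.mean(axis=0)
    return actor

# Assumed state: [distance error, relative speed], sampled uniformly.
states = rng.uniform(-1.0, 1.0, (256, 2))
targets = supervisor(states)

actor = init_mlp(2, 8, 1)    # state -> control action
critic = init_mlp(3, 8, 1)   # (state, action) -> performance index estimate

mse_before = mse(actor, states, targets)
pretrain_actor(actor, states, targets)
mse_after = mse(actor, states, targets)

# The Critic's input is the state concatenated with the Actor's output.
action, _ = forward(actor, states)
critic_input = np.concatenate([states, action], axis=1)
value_estimate, _ = forward(critic, critic_input)
```

After pre-training, the Actor approximates the assumed supervisor on the sampled states, which would give the subsequent Actor–Critic learning a reasonable starting policy instead of a random one.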

Acknowledgments

We acknowledge Dr. Cesare Alippi and Dr. Yuzhu Huang for their valuable discussions.

Author information

Corresponding author

Correspondence to Bin Wang.

Additional information

Communicated by C. Alippi, D. Zhao and D. Liu.

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61273136 and 61034002, and by the Beijing Natural Science Foundation under Grant No. 4122083.

About this article

Cite this article

Zhao, D., Wang, B. & Liu, D. A supervised Actor–Critic approach for adaptive cruise control. Soft Comput 17, 2089–2099 (2013). https://doi.org/10.1007/s00500-013-1110-y

  • DOI: https://doi.org/10.1007/s00500-013-1110-y
