Abstract
With the increasing scale of urbanization, traffic congestion has severely hindered the efficiency of social development. To address this, a series of intelligent traffic light control methods based on reinforcement learning have been proposed, and under certain conditions they achieve superior performance compared with conventional control methods. However, because these methods act by switching phases, and such switching actions must be executed immediately, almost none of them can provide a countdown function. This makes them difficult to apply in most real-world scenarios, where traffic safety and efficiency are practical concerns. For example, without a countdown, pedestrians cannot be told how many seconds of green light remain to cross the road. This paper proposes a novel method that naturally provides a countdown function by adopting a new action design. Specifically, this action design achieves control through time allocation: the model determines the duration of each phase at the beginning of every signal cycle. In this way, our method is more practical for real-world traffic applications. Extensive simulation experiments show that our method substantially eases congestion in single-intersection environments with the countdown requirement; e.g., our model reduces waiting time by 74% compared with a competitive baseline in the experiment with 2 phases and mixed flow.
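The time-allocation action design described above can be sketched in a few lines: at the start of each signal cycle, the model's raw outputs are converted into a duration for every phase, and because all durations are fixed up front, a countdown can be displayed for each phase. This is an illustrative sketch only; the function names, the softmax parameterization, and the minimum-green constraint are assumptions for the example, not the paper's exact formulation.

```python
import numpy as np

def allocate_phase_durations(logits, cycle_length, min_green=5.0):
    """Split a fixed signal cycle among phases.

    A softmax over the model's per-phase outputs distributes the
    cycle time, with a minimum green time reserved for each phase
    so no phase is starved. Since every duration is known at the
    beginning of the cycle, a countdown can be shown immediately.
    """
    logits = np.asarray(logits, dtype=float)
    n_phases = logits.size
    free_time = cycle_length - n_phases * min_green  # time left to distribute
    weights = np.exp(logits - logits.max())          # numerically stable softmax
    weights /= weights.sum()
    return min_green + free_time * weights

# Example: a 2-phase, 60 s cycle where the model slightly favors phase 0.
durations = allocate_phase_durations([0.5, 0.0], cycle_length=60.0)
assert abs(durations.sum() - 60.0) < 1e-9  # the allocation covers the whole cycle
```

In contrast, a phase-switching action ("switch now" vs. "keep current phase") must be executed the moment it is chosen, which is why it cannot announce remaining green time in advance.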
Ethics declarations
Funding
This work was supported in part by the National Key Research and Development Program of China (Grant No. 2018AAA0101400), in part by the National Natural Science Foundation of China (Grant Nos. 62036009, U1909203, and 61936006), and in part by the Innovation Capability Support Program of Shaanxi (Program No. 2021TD-05).
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Cite this article
Xiang, C., Jin, Z., Yu, Z. et al. Optimizing traffic efficiency via a reinforcement learning approach based on time allocation. Int. J. Mach. Learn. & Cyber. 14, 3381–3391 (2023). https://doi.org/10.1007/s13042-023-01838-1