Abstract
Optimizing energy consumption in public transportation systems is a pressing issue as the cost of energy increases over time. Since subways consume a substantial share of the energy used in transportation systems, concern over this issue has grown. In this paper, the problem of train speed profile determination is addressed within the framework of reinforcement learning (RL). First, the train dynamics are modeled, and the basics of RL are explained for the problem. As the novelty of this work, a new RL algorithm named Q-SARSA is proposed by incorporating the Q-learning and SARSA update rules, which makes Q-SARSA as fast as SARSA and as accurate as Q-learning. The algorithm is kept out of local optima by defining a new parameter, the convergence measurement (CM). Furthermore, another RL-based method, called the Deep-Q network, is designed by combining a deep, fully connected neural network with the Q-table using Q-SARSA updates. This Deep-Q network relieves the burden of iterative calculations by adopting gradient ascent for the network weight updates, and a new reward function is formulated to suit the network and the time-energy problem. The conventional Q-learning and SARSA algorithms, along with recent versions of the genetic algorithm, the bees algorithm, dynamic programming, and a deep neural network, are implemented as well for comparison purposes. Simulations are conducted using the route information of Tehran Metro Lines 3 and 5 and Shiraz Metro Line 1. The results show the proposed methods' considerable advantage and efficiency compared with the mentioned methods.
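To make the Q-SARSA idea concrete, the following is a minimal sketch of how the Q-learning and SARSA targets could be blended in a single tabular update. The abstract does not specify the exact combination rule, so the weighted average with coefficient `beta` shown here is an illustrative assumption, not the paper's definitive formulation.

```python
import numpy as np

def q_sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, beta=0.5):
    """One tabular Q-SARSA-style update (illustrative sketch).

    Blends the off-policy Q-learning target with the on-policy SARSA
    target. The weighting `beta` and the blending itself are assumptions
    for illustration; the paper defines the actual combination rule.
    """
    q_target = r + gamma * np.max(Q[s_next])      # Q-learning: greedy bootstrap
    sarsa_target = r + gamma * Q[s_next, a_next]  # SARSA: bootstrap on the action taken
    target = beta * q_target + (1.0 - beta) * sarsa_target
    Q[s, a] += alpha * (target - Q[s, a])         # standard TD step toward the blend
    return Q

# Tiny usage example on a 3-state, 2-action Q-table
Q = np.zeros((3, 2))
Q = q_sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)
```

With an all-zero table, both targets reduce to the immediate reward, so the single update moves `Q[0, 1]` to `alpha * r = 0.1`.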
Data availability
All data analyzed during this study are included in this published article and presented in Fig. 7. Further details are available from the corresponding author on reasonable request.
Author information
Contributions
Mohammad Ali Sandidzadeh: Supervision, Evaluation, Revision, and Editing.
Pedram Havaei: Methodology, Coding, Software, Revision, Editing, and Writing.
Ethics declarations
Conflict of interest
The authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sandidzadeh, M.A., Havaei, P. A comprehensive study on reinforcement learning application for train speed profile optimization. Multimed Tools Appl 82, 37351–37386 (2023). https://doi.org/10.1007/s11042-023-15051-3