Abstract
Optimizing energy consumption in public transportation systems is a pressing issue as the cost of energy increases over time. Since subways consume a substantial share of the energy used in transportation systems, concern over this issue has grown. In this paper, the problem of train speed profile determination is addressed within the framework of reinforcement learning (RL). First, the train dynamics are modeled, and the basics of RL are explained for the problem. As the novelty of this work, a new RL algorithm named Q-SARSA is proposed by incorporating the Q-learning and SARSA update rules, which makes Q-SARSA as fast as SARSA and as accurate as Q-learning. The algorithm is kept out of local optima by defining a new parameter, the convergence measurement (CM). Furthermore, another RL-based method, called the Deep-Q network, is designed by combining a deep, fully connected neural network with the Q-table using Q-SARSA updates. This Deep-Q network relieves the burden of iterative calculations by adopting gradient ascent for the network weight updates, and a new reward function is formulated to suit the network and the time-energy problem. The conventional Q-learning and SARSA algorithms, along with recent versions of the genetic algorithm, the bees algorithm, dynamic programming, and a deep neural network, are implemented as well for comparison purposes. Simulations are conducted using the route information of Tehran Metro Lines 3 and 5 and Shiraz Metro Line 1. The results show the proposed methods' considerable advantage and efficiency compared with the mentioned methods.
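To make the Q-SARSA idea concrete, the following is a minimal sketch of how the Q-learning and SARSA targets could be blended in a single tabular update. The abstract does not specify the exact combination rule, so the weighted average with coefficient `beta` shown here is an illustrative assumption, not the paper's definitive formulation.

```python
import numpy as np

def q_sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, beta=0.5):
    """One tabular Q-SARSA-style update (illustrative sketch).

    Blends the off-policy Q-learning target with the on-policy SARSA
    target. The weighting `beta` and the blending itself are assumptions
    for illustration; the paper defines the actual combination rule.
    """
    q_target = r + gamma * np.max(Q[s_next])      # Q-learning: greedy bootstrap
    sarsa_target = r + gamma * Q[s_next, a_next]  # SARSA: bootstrap on the action taken
    target = beta * q_target + (1.0 - beta) * sarsa_target
    Q[s, a] += alpha * (target - Q[s, a])         # standard TD step toward the blend
    return Q

# Tiny usage example on a 3-state, 2-action Q-table
Q = np.zeros((3, 2))
Q = q_sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)
```

With an all-zero table, both targets reduce to the immediate reward, so the single update moves `Q[0, 1]` to `alpha * r = 0.1`.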
Data availability
All data analyzed during this study are included in this published article and presented in Fig. 7. Further details are available from the corresponding author on reasonable request.
Author information
Contributions
Mohammad Ali Sandidzadeh: Supervision, Evaluation, Revision, and Editing.
Pedram Havaei: Methodology, Coding, Software, Revision, Editing, and Writing.
Ethics declarations
Conflict of interest
The authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sandidzadeh, M.A., Havaei, P. A comprehensive study on reinforcement learning application for train speed profile optimization. Multimed Tools Appl 82, 37351–37386 (2023). https://doi.org/10.1007/s11042-023-15051-3