ASTPSI: Allocating Spare Time and Planning Speed Interval for Intelligent Train Control of Sparse Reward

Zhang, Haotong; Xian, Gang

doi:10.1007/978-981-99-8079-6_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14447)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

When deep reinforcement learning (DRL) is used for train operation control in urban railways, the agent faces a complex, dynamic environment with sparse rewards. It is therefore crucial to alleviate the negative impact of sparse rewards on the search for the optimal trajectory. This paper introduces a novel algorithm called Allocating Spare Time and Planning Speed Intervals (ASTPSI), which dramatically reduces the blindness of exploration of intelligent train agents under sparse rewards and significantly improves their learning efficiency and operation quality. Combined with different DRL algorithms, ASTPSI can generate real-time train trajectories that meet the requirements. To evaluate the algorithm's performance, we verified the convergence rate of ASTPSI-DRL when optimizing train trajectories under sparse rewards on a real track. ASTPSI-DRL shows better performance and stability than genetic algorithms and the original DRL algorithms in terms of train energy consumption, punctuality, and stopping accuracy.

Acknowledgment

This work was supported by the Graduate Student Research Innovation Program of Chongqing Jiaotong University (CYS23515).

Author information

Authors and Affiliations

College of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China
Haotong Zhang
College of Computer, National University of Defense Technology, Changsha, China
Gang Xian

Corresponding author

Correspondence to Gang Xian.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Biao Luo
Chinese Academy of Sciences, Beijing, China
Long Cheng
Zhejiang University, Hangzhou, China
Zheng-Guang Wu
Guangdong University of Technology, Guangzhou, China
Hongyi Li
UNSW Sydney, Sydney, NSW, Australia
Chaojie Li

About this paper

Cite this paper

Zhang, H., Xian, G. (2024). ASTPSI: Allocating Spare Time and Planning Speed Interval for Intelligent Train Control of Sparse Reward. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14447. Springer, Singapore. https://doi.org/10.1007/978-981-99-8079-6_6

DOI: https://doi.org/10.1007/978-981-99-8079-6_6
Published: 14 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8078-9
Online ISBN: 978-981-99-8079-6
eBook Packages: Computer Science, Computer Science (R0)
