ABSTRACT
City metro network expansion, a variant of the transportation network design problem, aims to design new lines on top of an existing metro network. Existing methods in transportation network design either (i) struggle to formulate this problem efficiently, (ii) depend on expert guidance to produce solutions, or (iii) rely on problem-specific heuristics that are difficult to design. To address these limitations, we propose a reinforcement learning based method for the city metro network expansion problem. We formulate metro line expansion as a Markov decision process (MDP) that characterizes the problem as sequential station selection. We then train an actor-critic model to design the next metro line on the basis of the existing network. The actor is an encoder-decoder network with an attention mechanism that generates the parameterized policy used to select stations. The critic estimates the expected cumulative reward, assisting the training of the actor by reducing variance. The proposed method requires no expert guidance during design, since the learning procedure relies only on the reward calculation to tune the policy toward better station selection. It also avoids the difficulty of designing heuristics, because station selection is formalized by the learned policy. Considering origin-destination (OD) trips and social equity, we expand the current metro network of Xi'an, China, based on the real mobility information of 24,770,715 mobile phone users across the whole city. The results demonstrate the advantages of our method over existing approaches.
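To make the actor-critic setup described above concrete, the sketch below shows one possible shape of an attention-based actor that selects stations sequentially, together with a critic whose value estimate serves as a baseline. It is purely illustrative and is not the authors' implementation: the station features, network sizes, line length, and the placeholder reward (standing in for the OD-trip and equity objectives) are all assumed for exposition; only the overall actor-critic structure follows the abstract.

```python
# Illustrative sketch only: an attention-based actor that selects stations
# sequentially, with a critic baseline for variance reduction. All sizes,
# features, and the reward below are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

LINE_LENGTH = 10  # assumed number of stations on the new line


class Actor(nn.Module):
    """Encoder-decoder with additive attention; outputs a policy over stations."""

    def __init__(self, feat_dim: int, hid_dim: int = 128):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, hid_dim)   # embeds candidate stations
        self.decoder = nn.GRUCell(hid_dim, hid_dim)   # summarizes the partial line
        self.w_ref = nn.Linear(hid_dim, hid_dim)
        self.w_q = nn.Linear(hid_dim, hid_dim)
        self.v = nn.Linear(hid_dim, 1)

    def forward(self, stations, mask):
        # stations: (N, feat_dim) candidate features; mask: (N,) True = unavailable
        emb = self.encoder(stations)                  # (N, hid_dim)
        h = emb.mean(dim=0)                           # initial decoder state
        mask = mask.clone()
        chosen, log_probs = [], []
        for _ in range(LINE_LENGTH):
            # Additive attention scores over candidate stations.
            scores = self.v(torch.tanh(self.w_ref(emb) + self.w_q(h))).squeeze(-1)
            scores = scores.masked_fill(mask, float("-inf"))  # forbid reselection
            dist = torch.distributions.Categorical(logits=scores)
            idx = dist.sample()
            chosen.append(idx)
            log_probs.append(dist.log_prob(idx))
            mask[idx] = True
            h = self.decoder(emb[idx].unsqueeze(0), h.unsqueeze(0)).squeeze(0)
        return torch.stack(chosen), torch.stack(log_probs).sum()


class Critic(nn.Module):
    """Estimates the expected cumulative reward for the given instance."""

    def __init__(self, feat_dim: int, hid_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, 1)
        )

    def forward(self, stations):
        return self.net(stations).mean()


# One policy-gradient update with the critic value as a baseline.
actor, critic = Actor(feat_dim=4), Critic(feat_dim=4)
optimizer = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()), lr=1e-4
)
stations = torch.rand(200, 4)            # hypothetical candidate-station features
mask = torch.zeros(200, dtype=torch.bool)
line, log_prob = actor(stations, mask)
reward = torch.rand(())                  # placeholder for OD coverage + equity score
baseline = critic(stations)
loss = -(reward - baseline.detach()) * log_prob + F.mse_loss(baseline, reward)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In this sketch the critic value is subtracted from the reward before the policy-gradient term, which is the variance-reduction role the abstract attributes to the critic; the detach keeps the critic trained only by its own regression loss.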