Research Article
DOI: 10.1145/3488560.3498438

RLMob: Deep Reinforcement Learning for Successive Mobility Prediction

Published: 15 February 2022

Abstract

Human mobility prediction is an important task in spatiotemporal sequential data mining and urban computing. Despite extensive work on mining human mobility behavior, little attention has been paid to the problem of successive mobility prediction. State-of-the-art methods for human mobility prediction are mainly based on supervised learning, which faces four key challenges in achieving higher predictability and adapting well to successive prediction: 1) it cannot handle optimization targets that are discrete-continuous hybrids and non-differentiable (in our work, we assume a user's demands are always multi-targeted and can be modeled as a discrete-continuous hybrid function); 2) it cannot flexibly alter the recommendation strategy as user needs change in real scenarios; 3) it suffers from error propagation and exposure bias when predicting multiple points successively; 4) it cannot interactively explore a user's potential interests that do not appear in the history. Where previous methods struggle with these difficulties, reinforcement learning (RL) is an intuitive answer, and we innovatively introduce RL to the successive prediction task. In this paper, we formulate the problem as a Markov Decision Process and propose a framework, RLMob, to solve it. A simulated environment is carefully designed, and an actor-critic framework with an instance of Proximal Policy Optimization (PPO) is applied to handle the large state space of our setting. Experiments show that on this task, the performance of our approach is consistently superior to that of the compared approaches.
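The abstract's core recipe — cast successive next-location prediction as a Markov Decision Process and train an actor-critic agent with PPO — can be illustrated with a minimal sketch. Everything here is an illustrative assumption, not the paper's actual implementation: the class name `MobilityEnv`, the 0/1 accuracy reward, the fixed history window, and the clipping constant are all hypothetical stand-ins.

```python
# Hedged sketch: an MDP view of successive mobility prediction,
# plus PPO's clipped surrogate objective for one sample.
# All names and design choices are illustrative assumptions.

class MobilityEnv:
    """Toy simulated environment: the state is the recent location
    history, the action is the predicted next location, and the
    reward is 1 if the prediction matches the ground truth else 0."""

    def __init__(self, trajectory, history_len=3):
        self.trajectory = trajectory      # ground-truth location sequence
        self.history_len = history_len
        self.t = history_len

    def reset(self):
        # Initial state: the first `history_len` visited locations.
        self.t = self.history_len
        return tuple(self.trajectory[:self.history_len])

    def step(self, action):
        # Compare the agent's predicted location to the true next one.
        true_next = self.trajectory[self.t]
        reward = 1.0 if action == true_next else 0.0
        self.t += 1
        done = self.t >= len(self.trajectory)
        # Next state: a sliding window over the trajectory so far.
        state = tuple(self.trajectory[self.t - self.history_len:self.t])
        return state, reward, done


def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized) for one
    sample, where `ratio` is pi_new(a|s) / pi_old(a|s)."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

Because the reward (prediction accuracy over successive steps) is discrete and non-differentiable, it cannot be optimized directly by supervised gradients, which is exactly the gap the RL formulation fills.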

Supplementary Material

MP4 File (WSDM22-fp349.mp4)
This short presentation video walks through the RLMob paper, from the introduction, problem statement, and MDP formulation through the methodology and experiments. It covers only the basic ideas behind the paper; please see the paper itself for technical details.


Cited By

  • (2024) Predicting Human Mobility via Self-Supervised Disentanglement Learning. IEEE Transactions on Knowledge and Data Engineering 36(5): 2126-2141. DOI: 10.1109/TKDE.2023.3317175. Online publication date: May 2024.
  • (2024) Inferring Individual Human Mobility From Sparse Check-in Data: A Temporal-Context-Aware Approach. IEEE Transactions on Computational Social Systems 11(1): 600-611. DOI: 10.1109/TCSS.2022.3231601. Online publication date: February 2024.
  • (2023) Hierarchical Transformer with Spatio-temporal Context Aggregation for Next Point-of-interest Recommendation. ACM Transactions on Information Systems 42(2): 1-30. DOI: 10.1145/3597930. Online publication date: 27 September 2023.


    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN:9781450391320
    DOI:10.1145/3488560
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep reinforcement learning
    2. human mobility prediction
    3. sequential data mining
    4. successive mobility prediction

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Article Metrics

    • Downloads (last 12 months): 35
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 13 Feb 2025
