ABSTRACT
As a critical task in intelligent transportation systems, urban flow prediction (UFP) benefits many city services, including trip planning, congestion control, and public safety. Despite the achievements of previous studies, few efforts have investigated heterogeneity in space and time simultaneously; that is, correlations between regions can vary across timestamps. In this paper, we propose a spatio-temporal learning framework with mask and contrast enhancements to capture spatio-temporal variability among city regions. We devise a mask-enhanced pre-training task to learn latent correlations across the spatial and temporal dimensions, and then develop a graph-based method that extracts the significance of regions from the inter-regional attention weights. To further capture contrastive correlations between regions, we design a contrastive pre-training task with a global-local cross-attention mechanism. The two resulting pre-trained encoders can then capture latent spatio-temporal representations for time-varying flow forecasting. Extensive experiments on real-world urban flow datasets demonstrate that our method compares favorably with other state-of-the-art models.
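To make the idea of a mask-enhanced pre-training objective concrete, the following is a minimal sketch under stated assumptions, not the implementation used in this work. It assumes flows are stored as a `T x N` table (timestamps by regions), hides a uniformly random subset of cells, and scores a reconstruction only on the hidden cells. The names `random_mask` and `masked_mse`, the `mask_ratio` value, and the uniform sampling scheme are all hypothetical; the abstract does not specify the masking strategy.

```python
import random


def random_mask(flows, mask_ratio=0.25, seed=0):
    """Hide a random subset of (timestamp, region) cells.

    flows: list of T rows, each with N region values (assumed layout).
    Returns the masked copy and the set of hidden (t, n) positions.
    """
    rng = random.Random(seed)
    n_steps, n_regions = len(flows), len(flows[0])
    cells = [(t, n) for t in range(n_steps) for n in range(n_regions)]
    n_hidden = int(round(mask_ratio * len(cells)))
    hidden = set(rng.sample(cells, n_hidden))
    # Hidden cells are replaced by 0.0; a learned mask token is another option.
    masked = [
        [0.0 if (t, n) in hidden else flows[t][n] for n in range(n_regions)]
        for t in range(n_steps)
    ]
    return masked, hidden


def masked_mse(pred, target, hidden):
    """Mean squared reconstruction error over hidden cells only."""
    if not hidden:
        return 0.0
    errors = [(pred[t][n] - target[t][n]) ** 2 for t, n in hidden]
    return sum(errors) / len(errors)


# Toy data: 6 timestamps, 4 regions.
flows = [[float(4 * t + n) for n in range(4)] for t in range(6)]
masked, hidden = random_mask(flows, mask_ratio=0.25)
```

In a pre-training loop, an encoder would receive `masked` and be trained to minimise `masked_mse` between its output and `flows`, so that it must infer hidden values from the surrounding regions and timestamps.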