Abstract
Crowd flow prediction has been applied extensively in areas such as intelligent transportation and public safety, especially in metropolises, where crowd flow usually shows high nonlinearity and complex patterns. Most existing prediction methods suffer from (1) implicit long-term spatial dependencies, (2) external factors that lack crucial spatial attributes, and (3) complex spatio-temporal dynamics under uncertain external conditions, all of which limit performance. This paper proposes a novel method that uses a spatio-temporal attention network with heterogeneous feature enhancement. Specifically, heterogeneous feature enhancement introduces spatial mapping and Periodic Dilated Convolution (PDC): the former supplements the dimensions of the external factors, while PDC captures correlations in both the spatial and temporal domains. Moreover, a Spatio-Temporal Attention (STA) mechanism is proposed to further capture dynamic spatio-temporal correlations. Our framework is evaluated on several citywide crowd flow datasets, i.e., TaxiBJ, MobileBJ and TaxiNYC, and the experimental results indicate that the proposed method outperforms the state-of-the-art baselines by a satisfactory margin.
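The abstract does not define spatial mapping or PDC in detail, so the following is only a minimal sketch of the two generic ideas they build on, not the authors' implementation. It assumes that spatial mapping can be approximated by broadcasting an external-factor vector over the city grid, and that a dilated convolution along the time axis, with the dilation set to a period length, relates each frame to the frame one period later. The function names `tile_external` and `dilated_conv1d`, the grid size and the period of 4 steps are all hypothetical.

```python
import numpy as np


def tile_external(ext, height, width):
    """Broadcast an external-factor vector of shape (F,) to a (F, H, W) map.

    Assumption: every grid cell receives the same external values.
    """
    ext = np.asarray(ext, dtype=float)
    return np.broadcast_to(ext[:, None, None], (ext.shape[0], height, width)).copy()


def dilated_conv1d(x, kernel, dilation):
    """Valid-mode dilated 1D cross-correlation along the first (time) axis.

    x has shape (T, ...); kernel holds K weights; taps are `dilation` steps apart.
    The result has shape (T - (K - 1) * dilation, ...).
    """
    x = np.asarray(x, dtype=float)
    span = (len(kernel) - 1) * dilation
    out_len = x.shape[0] - span
    if out_len <= 0:
        raise ValueError("sequence too short for this kernel and dilation")
    out = np.zeros((out_len,) + x.shape[1:])
    for i, w in enumerate(kernel):
        start = i * dilation
        out += w * x[start:start + out_len]
    return out


# Toy data: 10 time steps on a 2 x 2 grid, where the flow at step t equals t.
flows = np.arange(10, dtype=float)[:, None, None] * np.ones((10, 2, 2))

# Average each frame with the frame one (hypothetical) period of 4 steps later.
periodic = dilated_conv1d(flows, kernel=[0.5, 0.5], dilation=4)

# Two external factors (e.g. a holiday flag and a weather flag) mapped onto the grid.
ext_map = tile_external([1.0, 0.0], height=2, width=2)

print(periodic.shape, ext_map.shape)
```

With a dilation of 4 and a two-tap kernel, each output frame combines only frames that are exactly one period apart, which is the property that makes dilation useful for periodic signals; how the paper combines this with spatial convolution is not stated in the abstract.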
Acknowledgment
This work was supported by the fund of China Academy of Railway Sciences (DZYF20-14).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Fang, K., Yang, E., Liu, Y., Liu, S. (2021). Hetero-STAN: Crowd Flow Prediction by Heterogeneous Spatio-Temporal Attention Network. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12889. Springer, Cham. https://doi.org/10.1007/978-3-030-87358-5_3
DOI: https://doi.org/10.1007/978-3-030-87358-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87357-8
Online ISBN: 978-3-030-87358-5
eBook Packages: Computer Science, Computer Science (R0)