Abstract
Predicting pedestrian trajectories in dynamic scenes is challenging because pedestrian motion is highly mobile and flexible. Most existing methods do not fully extract the interaction information between pedestrians. In this paper, an attention-based generative adversarial network (Atten-GAN) is proposed to model the social interactions between pedestrians. Atten-GAN is composed of a generator and a discriminator. The generator predicts multiple possible future trajectories from the past trajectories of pedestrians. The discriminator scores each input trajectory, determines whether it is ground truth or produced by the generator, and thereby pushes the generator toward trajectories that conform to social norms. Atten-GAN introduces an attention pooling module that assigns an influence weight to each pedestrian in the scene, so that pedestrian interaction information can be extracted more fully. In addition, to mitigate the vanishing gradients that make GANs difficult to train, noise that decreases over time is introduced into the loss function of the discriminator during training. Comparison experiments on the ETH and UCY datasets showed that Atten-GAN not only produced a variety of socially acceptable predicted trajectories but also outperformed existing generative model-based methods in prediction accuracy. Overall, Atten-GAN significantly improved prediction accuracy and made training more effective.
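The two mechanisms named in the abstract, attention-based influence weights and time-decaying noise in the discriminator loss, can be illustrated with a minimal sketch. This is not the authors' implementation. The distance-based attention score, the linear decay schedule, the choice to perturb the discriminator scores, and all function names (`attention_weights`, `pooled_context`, `noise_std`, `noisy_discriminator_loss`) are assumptions made for illustration only; in the paper the attention scores are learned by the network.

```python
import math
import random


def attention_weights(target, neighbors):
    """Softmax weights over neighbors; closer neighbors get larger weights.

    Assumption: the score is the negative Euclidean distance. A learned
    attention module would replace this hand-written score.
    """
    scores = [-math.dist(target, n) for n in neighbors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pooled_context(neighbors, weights):
    """Attention-weighted sum of neighbor positions (2-D)."""
    return tuple(
        sum(w * n[k] for w, n in zip(weights, neighbors)) for k in range(2)
    )


def noise_std(step, total_steps, initial_std=0.1):
    """Assumed linear schedule: initial_std at step 0, zero at total_steps."""
    return initial_std * max(0.0, 1.0 - step / total_steps)


def _clamp(p, eps=1e-7):
    """Keep a probability strictly inside (0, 1) so log() is defined."""
    return min(1.0 - eps, max(eps, p))


def noisy_discriminator_loss(real_score, fake_score, step, total_steps, rng):
    """Binary cross-entropy on scores perturbed by decaying Gaussian noise.

    Assumption: noise is added to the discriminator's scores; the abstract
    only states that decaying noise enters the discriminator loss.
    """
    std = noise_std(step, total_steps)
    real = _clamp(real_score + rng.gauss(0.0, std))
    fake = _clamp(fake_score + rng.gauss(0.0, std))
    return -(math.log(real) + math.log(1.0 - fake))


if __name__ == "__main__":
    rng = random.Random(0)
    neighbors = [(1.0, 0.0), (4.0, 3.0)]
    weights = attention_weights((0.0, 0.0), neighbors)
    print(weights, pooled_context(neighbors, weights))
    print(noisy_discriminator_loss(0.9, 0.2, step=10, total_steps=100, rng=rng))
```

Under these assumptions the noise is largest early in training, when the discriminator can most easily separate real from generated trajectories, and vanishes by the final step, at which point the loss reduces to the ordinary binary cross-entropy.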
Funding
This study was supported by the National Natural Science Foundation of China (Nos. 62073075 and 61573100) and Zhejiang Lab (No. 2022NB0AB02).
Ethics declarations
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of Interest
The authors declare no competing interests.
About this article
Cite this article
Fang, F., Zhang, P., Zhou, B. et al. Atten-GAN: Pedestrian Trajectory Prediction with GAN Based on Attention Mechanism. Cogn Comput 14, 2296–2305 (2022). https://doi.org/10.1007/s12559-022-10029-z
DOI: https://doi.org/10.1007/s12559-022-10029-z