Abstract
Predicting the crossing intention of pedestrians is an essential task for autonomous driving systems. Determining whether or not a pedestrian will cross at a crosswalk is an indispensable skill for safe driving. Although many datasets and models have been proposed to predict pedestrian intention precisely, they lack the ability to integrate different types of information. Therefore, we propose a Multi-Stream Network for Pedestrian Crossing Intention Prediction (MCIP) based on our novel optimal merging method. The proposed method consists of integration modules that take two visual and three non-visual elements as input. We achieved state-of-the-art performance in pedestrian crossing intention accuracy, F1-score, and AUC on both standard public pedestrian datasets, PIE and JAAD. Furthermore, we compared the performance of MCIP with other networks quantitatively by visualizing the predicted intention of the pedestrian. Lastly, we performed ablation studies to assess the effectiveness of our multi-stream methods.
Acknowledgement
This work was supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant through the Ministry of Science and ICT (MSIT), Government of Korea (Development of Previsional Intelligence Based on Long-Term Visual Memory Network) under Grant 2020-0-00004.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ham, JS., Bae, K., Moon, J. (2023). MCIP: Multi-Stream Network for Pedestrian Crossing Intention Prediction. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13801. Springer, Cham. https://doi.org/10.1007/978-3-031-25056-9_42
DOI: https://doi.org/10.1007/978-3-031-25056-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25055-2
Online ISBN: 978-3-031-25056-9
eBook Packages: Computer Science, Computer Science (R0)