MCIP: Multi-Stream Network for Pedestrian Crossing Intention Prediction

  • Conference paper
Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Abstract

Predicting the crossing intention of pedestrians is an essential task for autonomous driving systems. Determining whether or not a pedestrian will cross at a crosswalk is an indispensable skill for safe driving. Although many datasets and models have been proposed to predict pedestrian intention precisely, they lack the ability to integrate different types of information. Therefore, we propose a Multi-Stream Network for Pedestrian Crossing Intention Prediction (MCIP) based on our novel optimal merging method. The proposed method consists of integration modules that take two visual and three non-visual elements as input. We achieved state-of-the-art performance in accuracy, F1-score, and AUC for pedestrian crossing intention prediction on both public standard pedestrian datasets, PIE and JAAD. Furthermore, we compared the performance of our MCIP with other networks qualitatively by visualizing the intention of the pedestrian. Lastly, we performed ablation studies to observe the effectiveness of our multi-stream methods.
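
The three metrics named in the abstract (accuracy, F1-score, and AUC) can be illustrated for a binary crossing / not-crossing label with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made purely for illustration, and the AUC is computed with the standard pairwise-ranking formulation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the ground truth."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive (crossing) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (pairwise-ranking form of the ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = will cross, 0 = will not cross.
labels = [1, 0, 1, 1, 0, 0]
# Hypothetical predicted crossing probabilities from some model.
probs = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
# Assumed decision threshold of 0.5 to turn probabilities into labels.
preds = [1 if p >= 0.5 else 0 for p in probs]

print("accuracy:", accuracy(labels, preds))
print("F1-score:", f1_score(labels, preds))
print("AUC:", roc_auc(labels, probs))
```

Accuracy and F1-score depend on the chosen threshold, whereas AUC is computed from the raw scores and is threshold-independent, which is one reason the three are commonly reported together.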

Acknowledgement

This work was supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) Grant through the Ministry of Science and ICT (MSIT), Government of Korea (Development of Previsional Intelligence Based on Long-Term Visual Memory Network) under Grant 2020-0-00004.

Author information

Corresponding author

Correspondence to Je-Seok Ham.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ham, JS., Bae, K., Moon, J. (2023). MCIP: Multi-Stream Network for Pedestrian Crossing Intention Prediction. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13801. Springer, Cham. https://doi.org/10.1007/978-3-031-25056-9_42

  • DOI: https://doi.org/10.1007/978-3-031-25056-9_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25055-2

  • Online ISBN: 978-3-031-25056-9

  • eBook Packages: Computer Science (R0)
