
Global Context-Aware Feature Extraction and Visible Feature Enhancement for Occlusion-Invariant Pedestrian Detection in Crowded Scenes

Neural Processing Letters

Abstract

Pedestrian detection has made remarkable progress in recent years, but robust detection in crowded scenes remains a considerable challenge. Many methods rely on additional dataset annotations (visible body or head) or on attention mechanisms to alleviate the difficulties posed by occlusion. These methods, however, rarely use contextual information to strengthen the features extracted by the backbone network. This paper aims to extract more effective and discriminative pedestrian features for robust detection under heavy occlusion. To this end, we propose a Global Context-Aware module that exploits contextual information for pedestrian detection. Fusing the global context with information derived from the visible part of occluded pedestrians enhances the feature representations. Experimental results on two challenging benchmarks, CrowdHuman and CityPersons, demonstrate the effectiveness and merits of the proposed method. Code and models are available at: https://github.com/FlyingZstar/crowded-pedestrian-detection.
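
The abstract does not spell out how the module works, so the following is only a minimal, hypothetical sketch of the general idea of fusing a global context descriptor with visible-region information. It assumes global average pooling as the context descriptor and a binary visible-part mask; the names `global_context` and `fuse_context_and_visible` and the weight `alpha` are illustrative and are not taken from the paper or its repository.

```python
from typing import List

# A feature map stored as nested lists: [channels][height][width].
FeatureMap = List[List[List[float]]]
Mask = List[List[float]]  # [height][width], 1.0 where the pedestrian is visible


def global_context(features: FeatureMap) -> List[float]:
    """Global average pooling: one context value per channel (an assumption)."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


def fuse_context_and_visible(
    features: FeatureMap, visible_mask: Mask, alpha: float = 0.5
) -> FeatureMap:
    """Add the per-channel context to every location, then up-weight
    locations flagged as visible:

        out[c][y][x] = (f[c][y][x] + ctx[c]) * (1 + alpha * mask[y][x])

    This fusion rule is a generic illustration, not the authors' design.
    """
    ctx = global_context(features)
    return [
        [
            [
                (value + ctx[c]) * (1.0 + alpha * visible_mask[y][x])
                for x, value in enumerate(row)
            ]
            for y, row in enumerate(channel)
        ]
        for c, channel in enumerate(features)
    ]


if __name__ == "__main__":
    feats = [[[1.0, 3.0], [5.0, 7.0]]]   # one channel, 2x2
    mask = [[1.0, 0.0], [0.0, 0.0]]      # only the top-left cell is visible
    print(fuse_context_and_visible(feats, mask))
```

In this toy form, the context term gives every location the same scene-level signal, while the mask term emphasises unoccluded regions; a learned module would replace both fixed rules with trainable layers.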



Funding

This work was supported by the National Key Research and Development Program of China under Grant 2017YFC1601800 and by the National Natural Science Foundation of China under Grants 61876072, 61902153, and 62072243.

Author information


Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Zhenxing Liu. The first draft of the manuscript was written by Zhenxing Liu, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Xiaoning Song or Zhenhua Feng.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, Z., Song, X., Feng, Z. et al. Global Context-Aware Feature Extraction and Visible Feature Enhancement for Occlusion-Invariant Pedestrian Detection in Crowded Scenes. Neural Process Lett 55, 803–817 (2023). https://doi.org/10.1007/s11063-022-10910-w


  • DOI: https://doi.org/10.1007/s11063-022-10910-w
