A single-shot model for traffic-related pedestrian detection

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Traffic-related pedestrian detection is important for advanced driver-assistance systems and autonomous driving. Beyond the general pedestrian-detection problem, it poses the additional challenge of detecting small-target pedestrians in large input images. Recently, deep-learning-based methods, including convolutional neural networks, have been applied to pedestrian detection. In this study, we propose a single-shot multibox detector (SSD)-based method, called E-SSD, to increase the accuracy and speed of detecting traffic-related pedestrians. The method includes a deconvolutional feature-fusion module that provides shallow layers with additional contextual information, which is beneficial for detecting small objects. Additionally, we include an attention layer that exploits channel attention and spatial attention to use the most valuable information for detecting target pedestrians. Furthermore, we built a traffic-related pedestrian dataset (UCAR pedestrian) specific to Beijing. Evaluation on the UCAR dataset demonstrated that E-SSD was more effective than a baseline SSD model at detecting small-target pedestrians. Evaluation of E-SSD on the Caltech pedestrian, COCO Persons and INRIA pedestrian datasets demonstrated that its performance was comparable with that of state-of-the-art methods.
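
The abstract names two generic building blocks: fusing an upsampled deep feature map into a shallow one, and gating features with channel and spatial attention. The preview does not give the actual E-SSD layer definitions, so the following is only a minimal NumPy sketch of the general idea under stated assumptions. Nearest-neighbour upsampling stands in for a learned deconvolution, fusion is assumed to be an element-wise sum, the channel gate follows a squeeze-and-excitation-style pattern, and the spatial gate is a parameter-free placeholder. All function names, shapes and weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample2x(deep):
    # Nearest-neighbour 2x upsampling of a (C, H, W) map.
    # Assumption: stands in for a learned deconvolution layer.
    return deep.repeat(2, axis=1).repeat(2, axis=2)

def fuse(shallow, deep):
    # Assumption: fusion by element-wise sum; both maps have C channels.
    return shallow + upsample2x(deep)

def channel_attention(x, w1, w2):
    # Squeeze-and-excitation-style gate: global average pool,
    # two-layer bottleneck with ReLU, then a sigmoid weight per channel.
    squeezed = x.mean(axis=(1, 2))             # shape (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # shape (C // r,)
    gate = sigmoid(w2 @ hidden)                # shape (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    # Hypothetical parameter-free gate: sigmoid of the channel-mean
    # activation at each spatial location.
    gate = sigmoid(x.mean(axis=0, keepdims=True))  # shape (1, H, W)
    return x * gate

rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 16, 16))   # high-resolution, shallow features
deep = rng.standard_normal((8, 8, 8))        # low-resolution, deeper features
w1 = rng.standard_normal((2, 8)) * 0.1       # bottleneck weights (reduction r = 4)
w2 = rng.standard_normal((8, 2)) * 0.1

fused = fuse(shallow, deep)
out = spatial_attention(channel_attention(fused, w1, w2))
print(out.shape)  # (8, 16, 16)
```

Because both gates lie strictly between 0 and 1, the output keeps the shape of the shallow map and only rescales its activations. In a trained detector the gates and the upsampling would be learned rather than fixed.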

References

  1. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  2. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788

  3. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440

  4. Xie H, Chen Y, Shin H (2019) Context-aware pedestrian detection especially for small-sized instances with deconvolution integrated faster rcnn (dif r-cnn). Appl Intell 49(3):1200–1211

  5. Cai Z, Fan Q, Feris RS, Vasconcelos N (2016) A unified multi-scale deep convolutional neural network for fast object detection. In: European conference on computer vision, Springer, pp 354–370

  6. Du X, El-Khamy M, Lee J, Davis L (2017) Fused dnn: a deep neural network fusion approach to fast and robust pedestrian detection. In: 2017 IEEE winter conference on applications of computer vision (WACV), pp 953–961. IEEE

  7. Dollár P, Wojek C, Schiele B, Perona P (2011) Pedestrian detection: an evaluation of the state of the art. IEEE Trans Pattern Anal Mach Intell 34(4):743–761

  8. Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vision 88(2):303–338

  9. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) Ssd: single shot multibox detector. In: European conference on computer vision, Springer, pp 21–37

  10. Ren Y, Zhu C, Xiao S (2018) Small object detection in optical remote sensing images via modified faster r-cnn. Appl Sci 8(5):813

  11. Han C, Gao G, Zhang Y (2019) Real-time small traffic sign detection with revised faster-rcnn. Multimedia Tools Appl 78(10):13263–13278

  12. Hu P, Ramanan D (2017) Finding tiny faces. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 951–959

  13. Fu CY, Liu W, Ranga A, Tyagi A, Berg AC (2017) Dssd: deconvolutional single shot detector. arXiv preprint arXiv:1701.06659

  14. Zhang Z, Qiao S, Xie C, Shen W, Wang B, Yuille AL (2018) Single-shot object detection with enriched semantics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 5813–5821

  15. Pang Y, Xie J, Khan MH, Anwer RM, Khan FS, Shao L (2019) Mask-guided attention network for occluded pedestrian detection. In: Proceedings of the IEEE international conference on computer vision, pp 4967–4975

  16. Zhang S, Yang J, Schiele B (2018) Occluded pedestrian detection through guided attention in cnns. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6995–7003

  17. Zhang L, Liu Z, Zhang S, Yang X, Qiao H, Huang K, Hussain A (2019) Cross-modality interactive attention network for multispectral pedestrian detection. Inf Fusion 50:20–29

  18. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141

  19. Zhu Y, Zhao C, Guo H, Wang J, Zhao X, Lu H (2018) Attention couplenet: Fully convolutional attention coupling network for object detection. IEEE Trans Image Process 28(1):113–126

  20. Li H, Xiong P, An J, Wang L (2018) Pyramid attention network for semantic segmentation. arXiv preprint arXiv:1805.10180

  21. Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang X (2017) Residual attention network for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3156–3164

  22. Wang S, Cheng J, Liu H, Tang M (2018) Pcn: Part and context information for pedestrian detection with cnns. arXiv preprint arXiv:1804.04483

  23. Zhou C, Wu M, Lam SK (2019) Ssa-cnn: Semantic self-attention cnn for pedestrian detection. arXiv preprint arXiv:1902.09080

  24. Dziri A, Leroy B, Bremond F, et al. (2019) Spatial attention for pedestrian detection. In: 2019 16th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 1–8. IEEE

  25. Chen Z, Zhang L, Khattak AM, Gao W, Wang M (2019) Deep feature fusion by competitive attention for pedestrian detection. IEEE Access 7:21981–21989

  26. Hu Y, Wen G, Luo M, Dai D, Ma J, Yu Z (2018) Competitive inner-imaging squeeze and excitation for residual network. arXiv preprint arXiv:1807.08920

  27. Li C, Song D, Tong R, Tang M (2019) Illumination-aware faster r-cnn for robust multispectral pedestrian detection. Pattern Recognit 85:161–171

  28. Zhang L, Lin L, Liang X, He K (2016) Is faster r-cnn doing well for pedestrian detection? In: European conference on computer vision, Springer, pp 443–457

  29. Zhang S, Benenson R, Schiele B (2017) Citypersons: A diverse dataset for pedestrian detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3221

  30. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

  31. Kong T, Yao A, Chen Y, Sun F (2016) Hypernet: Towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 845–853

  32. Li Z, Zhou F (2017) Fssd: feature fusion single shot multibox detector. arXiv preprint arXiv:1712.00960

  33. Jeong J, Park H, Kwak N (2017) Enhancement of ssd by concatenating feature maps for object detection. arXiv preprint arXiv:1705.09587

  34. Cao G, Xie X, Yang W, Liao Q, Shi G, Wu J (2018) Feature-fused ssd: fast detection for small objects. In: Ninth international conference on graphic and image processing (ICGIP 2017). International Society for Optics and Photonics, vol 10615, p 106151E

  35. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision, Springer, pp 740–755

  36. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), IEEE, vol 1, pp 886–893

  37. Zhang S, Benenson R, Omran M, Hosang J, Schiele B (2016) How far are we from solving pedestrian detection? In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1259–1267

  38. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, et al. (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision 115(3):211–252

  39. Zhang S, Wen L, Bian X, Lei Z, Li SZ (2018) Single-shot refinement neural network for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4203–4212

  40. Li C (2018) High quality, fast, modular reference implementation of SSD in PyTorch. https://github.com/lufficc/SSD

  41. Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp 91–99

  42. Yang J, Lu J, Batra D, Parikh D (2017) A faster pytorch implementation of faster r-cnn. https://github.com/jwyang/faster-rcnn.pytorch

  43. Liu S, Huang D, Wang Y (2018) Receptive field block net for accurate and fast object detection. In: The European conference on computer vision (ECCV)

  44. Li J, Liang X, Shen S, Xu T, Feng J, Yan S (2017) Scale-aware fast r-cnn for pedestrian detection. IEEE Trans Multimed 20(4):985–996

  45. Zhang X, Cheng L, Li B, Hu HM (2018) Too far to see? not really!-pedestrian detection with scale-aware localization policy. IEEE Trans Image Process 27(8):3703–3715

  46. Brazil G, Liu X (2019) Pedestrian detection with autoregressive network phases. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7231–7240

  47. Brazil G, Yin X, Liu X (2017) Illuminating pedestrians via simultaneous detection & segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 4950–4959

  48. Song T, Sun L, Xie D, Sun H, Pu S (2018) Small-scale pedestrian detection based on somatic topology localization and temporal feature aggregation. arXiv preprint arXiv:1807.01438

  49. Toca C, Ciuc M, Patrascu C (2015) Normalized autobinomial markov channels for pedestrian detection. In: BMVC, pp 175–1

  50. Nam W, Dollár P, Han JH (2014) Local decorrelation for improved pedestrian detection. In: Advances in neural information processing systems, pp 424–432

  51. Marín J, Vázquez D, López AM, Amores J, Leibe B (2013) Random forests of local experts for pedestrian detection. In: Proceedings of the IEEE international conference on computer vision, pp 2592–2599

  52. Dollár P, Appel R, Belongie S, Perona P (2014) Fast feature pyramids for object detection. IEEE Trans Pattern Anal Mach Intell 36(8):1532–1545

  53. Paisitkriangkrai S, Shen C, Van Den Hengel A (2014) Strengthening the effectiveness of pedestrian detection with spatially pooled features. In: European conference on computer vision, Springer, pp 546–561

  54. Lim JJ, Zitnick CL, Dollár P (2013) Sketch tokens: a learned mid-level representation for contour and object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3158–3165

  55. Lin C, Lu J, Wang G, Zhou J (2018) Graininess-aware deep feature learning for pedestrian detection. In: Proceedings of the European conference on computer vision (ECCV), pp 732–747

Acknowledgements

This work was completed while the first author was working as an intern in the UCAR AI lab. The first author thanks the UCAR AI lab for their support with this work. Additionally, the authors acknowledge funding from the Fundamental Research Funds for Central Universities of China (Nos. FRF-GF-18-009B and FRF-BD-19-001A) and the 111 Project (Grant No. B12012).

Author information

Corresponding author

Correspondence to Weidong Zhang.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sun, C., Ai, Y., Qi, X. et al. A single-shot model for traffic-related pedestrian detection. Pattern Anal Applic 25, 853–865 (2022). https://doi.org/10.1007/s10044-022-01076-1

  • DOI: https://doi.org/10.1007/s10044-022-01076-1
