A single-shot model for traffic-related pedestrian detection

Sun, Chang; Ai, Yibo; Qi, Xing; Wang, Sheng; Zhang, Weidong

doi:10.1007/s10044-022-01076-1

Theoretical Advances
Published: 09 June 2022

Pattern Analysis and Applications, volume 25, pages 853–865 (2022)

Chang Sun¹, Yibo Ai¹, Xing Qi², Sheng Wang² & Weidong Zhang¹

Abstract

Traffic-related pedestrian detection is important for advanced driver-assistance systems and autonomous driving. Beyond the difficulties of general pedestrian detection, it poses the additional challenge of detecting small-target pedestrians in large input images. Recently, deep-learning-based methods, including convolutional neural networks, have been applied to pedestrian detection. In this study, we propose E-SSD, a method based on the single-shot multibox detector (SSD), to increase the accuracy and speed of detecting traffic-related pedestrians. The method includes a deconvolutional feature-fusion module that provides shallow layers with additional contextual information, which is beneficial for detecting small objects. We also include an attention layer that exploits channel attention and spatial attention to use the most valuable information for detecting target pedestrians. Furthermore, we built a traffic-related pedestrian dataset (UCAR pedestrian) specific to Beijing. Evaluation on the UCAR dataset demonstrated that E-SSD was more effective than a baseline SSD model at detecting small-target pedestrians. Evaluation of E-SSD on the Caltech pedestrian, COCO Persons and INRIA pedestrian datasets demonstrated that its performance was comparable with that of state-of-the-art methods.
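
The two ideas named in the abstract, fusing an upsampled deep feature map into a shallow one and then re-weighting the result by channel and by spatial location, can be illustrated with a small sketch. This is not the E-SSD architecture, whose layer details are not given on this page. It assumes parameter-free stand-ins: nearest-neighbour 2x upsampling in place of a learned deconvolution, element-wise addition as the fusion rule, and sigmoid gates computed from plain averages in place of learned attention weights. All function names (`upsample2x`, `fuse`, `channel_attention`, `spatial_attention`) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, used here as an attention gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a C x H x W nested list.

    Assumption: stands in for a learned deconvolution (transposed convolution).
    """
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def fuse(shallow, deep):
    """Add the upsampled deep map to the shallow map, element by element."""
    up = upsample2x(deep)
    return [[[s + d for s, d in zip(s_row, d_row)]
             for s_row, d_row in zip(s_ch, d_ch)]
            for s_ch, d_ch in zip(shallow, up)]


def channel_attention(fmap):
    """Scale every channel by sigmoid(global average of that channel)."""
    out = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        gate = sigmoid(sum(sum(row) for row in channel) / count)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale every location by sigmoid(mean over channels at that location)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
              for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * gates[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


# Toy input: a 2-channel 4x4 shallow map and a 2-channel 2x2 deep map.
shallow = [[[1.0] * 4 for _ in range(4)] for _ in range(2)]
deep = [[[0.0, 1.0], [2.0, 3.0]],
        [[0.0, 0.0], [0.0, 0.0]]]

fused = fuse(shallow, deep)                               # 2 x 4 x 4
attended = spatial_attention(channel_attention(fused))    # same shape
```

In this toy setting the shallow map keeps its resolution while each 2x2 block inherits one value from the coarser deep map, which is the sense in which fusion gives shallow layers extra context. A trained model would learn the upsampling kernel and the attention weights instead of using fixed averages.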

Acknowledgements

This work was completed while the first author was working as an intern in the UCAR AI lab. The first author thanks the UCAR AI lab for their support with this work. Additionally, the authors acknowledge funding from the Fundamental Research Funds for Central Universities of China (Nos. FRF-GF-18-009B and FRF-BD-19-001A) and the 111 Project (Grant No. B12012).

Author information

Authors and Affiliations

National Center for Materials Service Safety, University of Science and Technology Beijing, Beijing, China
Chang Sun, Yibo Ai & Weidong Zhang
AI Lab, UCAR, 118 East Zhongguancun Road, Haidian Dist., Beijing, China
Xing Qi & Sheng Wang

Corresponding author

Correspondence to Weidong Zhang.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sun, C., Ai, Y., Qi, X. et al. A single-shot model for traffic-related pedestrian detection. Pattern Anal Applic 25, 853–865 (2022). https://doi.org/10.1007/s10044-022-01076-1

Received: 05 March 2020
Accepted: 10 May 2022
Published: 09 June 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s10044-022-01076-1
