FE-CSP: a fast and efficient pedestrian detector with center and scale prediction

Qin, Yugang; Qian, Yurong; Wei, Hongyang; Fan, Yingying; Feng, Peiyun

doi:10.1007/s11227-022-04815-7

Published: 21 September 2022

Volume 79, pages 4084–4104 (2023)

The Journal of Supercomputing

Yugang Qin¹,
Yurong Qian¹,²,³ (ORCID: orcid.org/0000-0001-6564-4745),
Hongyang Wei¹,
Yingying Fan² &
…
Peiyun Feng¹

Abstract

Pedestrian detection still faces several pressing problems: severe occlusion between pedestrians, difficulty in detecting small objects, and low detection speed. In this paper, we propose FE-CSP, a fast and efficient pedestrian detector with center and scale prediction. We combine channel attention with spatial attention, replace traditional convolution with deformable convolution, and embed these components in the backbone network to form CSANet (Channel and Spatial Attention Network), which efficiently extracts the semantic features of the object. We then propose a feature pyramid network that replaces the traditional concatenation for multi-scale feature detection, which effectively improves detection speed. In experiments on CityPersons, our method achieves 10.1%, 13.7% and 47.4% $MR^{-2}$ at a speed of 0.21 s/img on the Reasonable, Small and Heavy settings, respectively. On Caltech, our method achieves 5.2% $MR^{-2}$ at a speed of 0.06 s/img on the Reasonable setting, further demonstrating the superiority and generalization ability of the proposed method.
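For readers unfamiliar with the $MR^{-2}$ figures quoted above: the metric is the log-average miss rate, commonly computed as the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space over $[10^{-2}, 10^{0}]$; lower is better. The sketch below illustrates that convention only. It is not the authors' evaluation code, and the function name, argument names, and the example curve are hypothetical.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Log-average miss rate over FPPI in [1e-2, 1e0].

    `fppi` and `miss_rate` are parallel lists describing a detector's
    miss-rate curve, sorted by increasing FPPI. Names are illustrative.
    """
    if len(fppi) != len(miss_rate) or not fppi:
        raise ValueError("fppi and miss_rate must be non-empty and equal length")

    # Reference points evenly spaced in log space: 10^-2, 10^-1.75, ..., 10^0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    log_sum = 0.0
    for ref in refs:
        # Use the miss rate at the largest FPPI not exceeding the reference.
        # If the curve starts above the reference, count it as a full miss.
        sampled = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                sampled = m
            else:
                break
        # Clamp to avoid log(0) when a curve reaches zero miss rate.
        log_sum += math.log(max(sampled, 1e-10))

    # Geometric mean = exp of the mean of the logs.
    return math.exp(log_sum / num_points)


if __name__ == "__main__":
    # Hypothetical curve: miss rate falls as more false positives are allowed.
    example_fppi = [0.001, 0.05, 0.5]
    example_miss = [0.40, 0.20, 0.10]
    print(f"MR^-2 = {log_average_miss_rate(example_fppi, example_miss):.3f}")
```

Because the mean is taken in log space, a detector is rewarded for keeping the miss rate low across the whole FPPI range rather than at a single operating point.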

Data Availability

All data generated or analysed during this study are included in this published article.

Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 61966035), the International Cooperation Project of the Science and Technology Department of the Autonomous Region (Grant No. 2020E01023), the Joint Foundation of the National Natural Science Foundation of China (Grant No. U1803261), the Autonomous Region Natural Science Foundation of China (Grant No. 2021D01C083) and the Autonomous Region Science and Technology Program Youth Science Fund Project (Grant No. 2022D01C83).

Author information

Authors and Affiliations

School of Software, Xinjiang University, Urumqi, 830000, China
Yugang Qin, Yurong Qian, Hongyang Wei & Peiyun Feng
Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi, 830000, China
Yurong Qian & Yingying Fan
College of Information Science and Engineering, Xinjiang University, Urumqi, 830000, China
Yurong Qian

Corresponding author

Correspondence to Yurong Qian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest related to this work. We declare that we have no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qin, Y., Qian, Y., Wei, H. et al. FE-CSP: a fast and efficient pedestrian detector with center and scale prediction. J Supercomput 79, 4084–4104 (2023). https://doi.org/10.1007/s11227-022-04815-7

Accepted: 07 September 2022
Published: 21 September 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s11227-022-04815-7
