Pedestrian detection based on channel feature fusion and enhanced semantic segmentation

Applied Intelligence

Abstract

Pedestrian detection is widely applied in autonomous driving, intelligent transportation, and robotics, but a satisfactory balance between accuracy and speed has not yet been reached. Against complex backgrounds with high pedestrian density and serious occlusion, pedestrian detection models based on center and scale prediction (CSP) may miss pedestrians or produce false detections. This paper presents an improved pedestrian detection method based on channel feature fusion and enhanced semantic segmentation. In the feature extraction stage, a feature fusion module based on squeeze and excitation is proposed, and multi-scale feature maps are fused to obtain faster detection and higher accuracy. In the detection head, an enhanced semantic segmentation module is introduced to address missed detections of long-distance pedestrians. The CIoU (Complete Intersection over Union) loss function is used to improve the confidence of pedestrian detections. Experiments with different networks, feature fusion scales, and detection methods are carried out to verify the performance of the proposed approach. The results show that the proposed model detects pedestrians with high accuracy in occluded, dense, and long-distance scenes, and that detection can be accelerated while keeping a low missed detection rate and a lower computational cost. The approach achieves high accuracy and robustness, especially against complex backgrounds.
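
The abstract names the Complete Intersection over Union loss without giving its form. As a rough illustration only, the sketch below computes the commonly published definition of that loss for a single pair of axis-aligned boxes: the IoU term, a normalized center-distance penalty, and an aspect-ratio consistency term. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example; the exact formulation and weighting used in the article are not visible in this preview and may differ.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Illustrative CIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumes x2 > x1 and y2 > y1 for both boxes. This is a sketch of the
    standard definition, not the article's implementation.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(x2, gx2) - min(x1, gx1)
    enclose_h = max(y2, gy2) - min(y1, gy1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, this quantity still varies with the center distance when the predicted and ground-truth boxes do not overlap, which is the usual motivation for choosing it.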

Data Availability

The data generated and analysed during the current study are available from the corresponding author on reasonable request.

Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 62376089 and 62202147), the Key R&D Plan of Hubei Province (2020BHB004 and 2020BAB012), and the Natural Science Foundation of Hubei Province (2020CFB798).

Author information

Corresponding author

Correspondence to Xinlu Zong.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zong, X., Xu, Y., Ye, Z. et al. Pedestrian detection based on channel feature fusion and enhanced semantic segmentation. Appl Intell 53, 30203–30218 (2023). https://doi.org/10.1007/s10489-023-04957-y

  • DOI: https://doi.org/10.1007/s10489-023-04957-y
