Double parallel branches FCOS for human detection in a crowd

Multimedia Tools and Applications

Abstract

Anchor-free detectors have become popular because they move prediction from the region level to the pixel level and require fewer hyper-parameters. Most anchor-free algorithms add a center-ness branch that suppresses prediction points far from the center of the target. In a pedestrian dataset, this indirectly weakens the features of the head, which lies far from the center of a pedestrian box. In a dense crowd, however, the head features of humans are critical to alleviating occlusion. To address this problem, we analyze the target-scale statistics of a dense pedestrian dataset and propose the Double Parallel Branches FCOS (DPB-FCOS) detector. Alongside the original prediction branch, we add a head branch that generates additional prediction boxes, and we redefine the positive-sample selection method of this branch so that it generates more prediction boxes at the head position of the human body. In addition, considering three factors (overlap area, distance, and aspect ratio), we design a regression loss that is better suited to anchor-free detectors. The center-point distance in DIoU is replaced by the distances between the upper-left corner points and between the lower-right corner points, which significantly improves the model’s performance. We verify our method on two popular models. Compared with the baselines, it improves accuracy by 5.9% on FCOS and by 3.8% on ATSS on the CrowdHuman dataset.
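The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The center-ness function follows the standard FCOS definition. The corner-distance loss is only an assumed form: the abstract does not give the exact formula, so the function name corner_distance_loss, the averaging of the two corner terms, and the normalization by the enclosing-box diagonal are hypothetical choices modeled on DIoU, not the authors' definition.

```python
import math


def centerness(l, t, r, b):
    """Standard FCOS center-ness for a point whose distances to the left, top,
    right and bottom box sides are l, t, r, b (all assumed positive)."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def corner_distance_loss(pred, gt):
    """ASSUMED form (not taken from the paper):
    1 - IoU + (d_tl^2 + d_br^2) / (2 * c^2),
    where d_tl and d_br are the distances between the upper-left and the
    lower-right corners of the two boxes, and c is the diagonal of the
    smallest box enclosing both."""
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    d_tl2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d_br2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    penalty = (d_tl2 + d_br2) / (2 * c2) if c2 > 0 else 0.0
    return 1.0 - iou(pred, gt) + penalty


# A point near the top of a 10 x 10 box (head region) versus the box center.
head_score = centerness(5, 1, 5, 9)
center_score = centerness(5, 5, 5, 5)

# Concentric boxes: the centers coincide, so a center-distance term would be 0,
# while the corner-distance term still penalizes the size mismatch.
loss_concentric = corner_distance_loss((2, 2, 8, 8), (0, 0, 10, 10))
loss_identical = corner_distance_loss((0, 0, 10, 10), (0, 0, 10, 10))
```

In this sketch the point near the top edge receives a center-ness of about 0.33 against 1.0 at the box center, which shows how head locations are down-weighted. For the concentric boxes the centers coincide, yet the assumed corner term remains positive, so box size and shape errors still contribute to the distance penalty.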

Author information

Corresponding author

Correspondence to Qing Song.

Ethics declarations

Conflict of Interests

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service, or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Song, Q., Wang, H., Yang, L. et al. Double parallel branches FCOS for human detection in a crowd. Multimed Tools Appl 81, 15707–15723 (2022). https://doi.org/10.1007/s11042-022-12439-5

  • DOI: https://doi.org/10.1007/s11042-022-12439-5
