Abstract
Center and Scale Prediction (CSP) was the first method to bring the anchor-free approach to pedestrian detection. Pedestrian detection often takes place in complex scenes with occlusion, where the single centre point predicted by CSP makes it difficult to extract pedestrian features. To solve this problem, this paper presents a multi-branch detection network (MBDN) based on trigger attention. First, a multi-centre point prediction branch feature extraction model (multi-centre) is proposed to address CSP's missed detections in occlusion scenarios. Second, a novel trigger attention module is designed. The module uses visible parts as triggers to learn the weight relationships among the branches automatically, so that the network learns the confidence of each branch's centre point and strengthens the branch in which the visible area on the feature map is located. Finally, a channel non-maximum suppression (NMS) module is used in MBDN to reduce redundant bounding boxes. Experimental results show that the log-average miss rate (MR⁻²) on the heavy subset is reduced from 49.63% to 45.51% while performance on the reasonable subset is maintained. Code and models are available at https://github.com/weidalin/MBDN.
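The headline result is stated as a log-average miss rate (MR⁻²). As a reading aid, the sketch below shows how this metric is commonly computed in pedestrian detection benchmarks: the miss rate is sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space over [10⁻², 10⁰] and the geometric mean is taken. The function name, the argument layout, and the assumption that the curve is sorted by increasing FPPI are illustrative choices, not taken from the MBDN code.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Sketch of the log-average miss rate over FPPI in [1e-2, 1e0].

    Assumes `fppi` is sorted in increasing order and `miss_rate` holds
    the miss rate at each corresponding FPPI value (hypothetical layout).
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Reference FPPI values, evenly spaced in log space.
    refs = np.logspace(-2.0, 0.0, num_points)

    sampled = np.empty(num_points)
    for i, ref in enumerate(refs):
        # Indices of curve points whose FPPI does not exceed the reference.
        idx = np.where(fppi <= ref)[0]
        # Use the last such point; if the curve starts above the
        # reference, count that sample as a miss rate of 1.0.
        sampled[i] = miss_rate[idx[-1]] if idx.size else 1.0

    # Geometric mean of the sampled miss rates; clip to avoid log(0).
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))
```

Read this way, the reported change from 49.63% to 45.51% on the heavy subset is a drop of 4.12 percentage points, or roughly 8.3% relative to the baseline.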
Data Availability
The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was sponsored in part by the National Natural Science Foundation of Guangdong under Grant 2020A1515011409, in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2019B01015300 and 2021B0101190003, in part by the Special Project of Science and Technology Innovation Strategy of Guangdong Province under Grant 2021A1414030004, in part by the Key Program of NSFC-Guangdong Joint Funds under Grants U1801263 and U2001201, in part by the Provincial Agricultural Science and Technology Innovation and Extension Project of Guangdong Province under Grant 2019KJ147, and in part by the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069.
Author information
Contributions
Zhuowei Wang contributed to the conception of the study; Weida Lin performed the experiments; Weida Lin and Yang Wang contributed significantly to the analysis and manuscript preparation; Weida Lin performed the data analyses and wrote the manuscript; Weida Lin, Zhuowei Wang, Lianglun Cheng, and Xiaoyu Song helped perform the analysis through constructive discussions.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent to publish
Not applicable.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, Z., Lin, W., Cheng, L. et al. Multi-branch detection network based on trigger attention for pedestrian detection under occlusion. Appl Intell 53, 6119–6132 (2023). https://doi.org/10.1007/s10489-022-03747-2
DOI: https://doi.org/10.1007/s10489-022-03747-2