Abstract
In pedestrian detection, balancing the trade-off between accuracy and speed remains challenging. Following the center-point-based one-stage object detection paradigm, we propose a pedestrian detection algorithm based on multi-scale attention feature aggregation (MAFA) that improves accuracy while preserving real-time performance. We refer to the proposed algorithm as MAFA-Net. Deep dilated blocks are designed to extract deeper features. Pedestrian attention blocks are added to mine more relevant information between features along the spatial and channel-wise dimensions, which enhances pedestrian features. Feature aggregation modules fuse features of different scales, combining the rich semantic information of high-level features with the accurate location information of low-level features. Experiments were conducted on two challenging pedestrian detection datasets, CityPersons and Caltech, using the log-average miss rate MR⁻² as the evaluation index (lower is better). On Caltech, MR⁻² is 4.58% on the reasonable subset. On CityPersons, MR⁻² is 11.47% and 10.05% on the reasonable and partial occlusion subsets, respectively, which is 0.43% and 1.35% better than the second-best compared detection method. These results verify the effectiveness and feasibility of the algorithm.
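The log-average miss rate used as the evaluation index above is conventionally computed by sampling the miss rate at nine false-positives-per-image (FPPI) values evenly spaced in log space over [10⁻², 10⁰] and taking their geometric mean. The following is a minimal sketch of that convention, not the authors' evaluation code. The function name, the argument layout (FPPI values sorted in ascending order with matching miss rates), and the 1e-10 clipping floor are assumptions made for illustration.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI reference points.

    Assumes `fppi` is sorted in ascending order and `miss_rate[i]` is the
    miss rate reached at `fppi[i]` (hypothetical input layout).
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Reference FPPI values: 10^-2, 10^-1.75, ..., 10^0.
    refs = np.logspace(-2.0, 0.0, num_points)

    # If the curve never reaches a reference point, count it as a miss rate of 1.
    sampled = np.ones(num_points)
    for i, ref in enumerate(refs):
        idx = np.nonzero(fppi <= ref)[0]
        if idx.size > 0:
            # Use the operating point with the largest FPPI not exceeding ref.
            sampled[i] = miss_rate[idx[-1]]

    # Clip before the log to avoid log(0); the floor value is an assumption.
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))


if __name__ == "__main__":
    # Toy curve with made-up numbers, for illustration only.
    toy_fppi = [0.005, 0.05, 0.5]
    toy_miss = [0.40, 0.20, 0.10]
    print(f"MR^-2 = {100 * log_average_miss_rate(toy_fppi, toy_miss):.2f}%")
```

Because the mean is taken in log space, low miss rates at small FPPI values weigh heavily, so a detector has to stay accurate at strict false-positive budgets to score well.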
Acknowledgments
This study was sponsored by the China Shandong Key R&D Plan (2018GGX106008) and was supported by the China Shandong Key Laboratory of Medical Physical Image Processing Technology.
Cite this article
Xia, H., Wan, H., Ou, J. et al. MAFA-net: pedestrian detection network based on multi-scale attention feature aggregation. Appl Intell 52, 7686–7699 (2022). https://doi.org/10.1007/s10489-021-02796-3
DOI: https://doi.org/10.1007/s10489-021-02796-3