MAFA-net: pedestrian detection network based on multi-scale attention feature aggregation

Applied Intelligence

Abstract

Balancing accuracy against speed remains a challenge for pedestrian detection algorithms. Following the center-point-based one-stage object detection paradigm, a pedestrian detection algorithm based on multi-scale attention feature aggregation (MAFA) is proposed to improve accuracy while taking real-time performance into account. We refer to the proposed algorithm as MAFA-Net. Deep dilated blocks are designed to extract deeper features. Pedestrian attention blocks are added to mine more relevant information between features along the spatial and channel-wise dimensions, which enhances pedestrian features. Feature aggregation modules fuse features at different scales, combining the rich semantic information of high-level features with the accurate location information of low-level features. Experiments were conducted on two challenging pedestrian detection datasets, CityPersons and Caltech, using MR⁻² as the evaluation metric. On Caltech, MR⁻² is 4.58% under the reasonable setting. On CityPersons, MR⁻² is 11.47% and 10.05% under the reasonable and partial occlusion settings, respectively, which is 0.43% and 1.35% better than the second-best method in the comparison. These results verify the effectiveness and feasibility of the algorithm.
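
For readers unfamiliar with the evaluation metric, MR⁻² is commonly computed as the log-average miss rate: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space between 10⁻² and 10⁰. The following is a minimal sketch of that common definition, not the evaluation code used in this article. The function name `log_average_miss_rate`, the fallback miss rate of 1.0 for curves that start above a reference point, and the toy curve are assumptions made for illustration.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Log-average miss rate over FPPI in [1e-2, 1e0].

    fppi: false positives per image, sorted in ascending order.
    miss_rate: miss rate at each fppi value (same length as fppi).
    """
    if len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must have the same length")

    # Reference FPPI values, evenly spaced in log space from 1e-2 to 1e0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Miss rate at the largest FPPI that does not exceed the reference.
        below = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve has no point at or below the reference,
        # treat the miss rate there as 1.0 (every pedestrian missed).
        sampled.append(below[-1] if below else 1.0)

    # Geometric mean; clamp to avoid log(0) for a perfect detector.
    log_sum = sum(math.log(max(m, 1e-10)) for m in sampled)
    return math.exp(log_sum / num_points)


# Toy miss-rate curve (made-up numbers, not results from the article).
toy_fppi = [0.005, 0.05, 0.5]
toy_miss_rate = [0.30, 0.15, 0.08]
print(f"MR^-2 = {100 * log_average_miss_rate(toy_fppi, toy_miss_rate):.2f}%")
```

Because the average is taken in log space, a lower MR⁻² is better, and it rewards detectors that keep the miss rate low across the whole FPPI range rather than at a single operating point.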

Acknowledgments

This study was sponsored by the China Shandong Key R&D Plan (2018GGX106008) and was supported by the China Shandong Key Laboratory of Medical Physical Image Processing Technology.

Author information

Corresponding authors

Correspondence to Hao Xia or Chengjie Bai.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xia, H., Wan, H., Ou, J. et al. MAFA-net: pedestrian detection network based on multi-scale attention feature aggregation. Appl Intell 52, 7686–7699 (2022). https://doi.org/10.1007/s10489-021-02796-3

  • DOI: https://doi.org/10.1007/s10489-021-02796-3
