AFC-Net: adjacent feature complementary for crowded pedestrian detection

  • Original Paper
  • Machine Vision and Applications

Abstract

Although pedestrian detection algorithms for crowded scenes have improved significantly in recent years, an imbalance between detection accuracy and speed still exists. To address this issue, we propose an adjacent feature complementary network for crowded pedestrian detection, called AFC-Net, which is based on a one-stage anchor-free detector. First, deep dilated convolution (DDC) is introduced into the backbone to expand the receptive field, so that the feature map retains its original size while its spatial sensitivity is enhanced. Second, hierarchical feature extraction (HFE) is designed to extract feature information according to the properties of features from different layers. Specifically, on high-level features, a multi-scale feature extractor and a channel attention mechanism are employed to extract contextual information; on low-level features, a spatial attention mechanism is applied to filter out background information. Finally, adjacent feature integration (AFI) is proposed to aggregate the correlated features of adjacent layers, making the feature representation more comprehensive and thus improving the pedestrian detection results. Experiments on the challenging CityPersons and CrowdHuman datasets show that the proposed algorithm maintains comparable and stable detection accuracy while greatly reducing the number of network parameters and effectively improving speed.
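The claim that dilated convolution expands the receptive field while the feature map keeps its size follows from the standard convolution size formulas. The sketch below illustrates that general property only. It is not the paper's DDC configuration: the 3x3 kernel, the dilation rates 1, 2 and 4, the 64-pixel input and the helper names are all illustrative assumptions.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding that preserves spatial size at stride 1 (odd kernel sizes)."""
    return dilation * (kernel_size - 1) // 2


def conv_output_size(size: int, kernel_size: int, stride: int = 1,
                     padding: int = 0, dilation: int = 1) -> int:
    """Output length along one spatial axis of a convolution."""
    k_eff = effective_kernel(kernel_size, dilation)
    return (size + 2 * padding - k_eff) // stride + 1


if __name__ == "__main__":
    size = 64  # hypothetical feature-map side length
    for dilation in (1, 2, 4):  # hypothetical dilation rates
        pad = same_padding(3, dilation)
        out = conv_output_size(size, 3, stride=1, padding=pad, dilation=dilation)
        print(f"dilation={dilation}: kernel span={effective_kernel(3, dilation)}, "
              f"padding={pad}, output size={out}")
    # For contrast, a stride-2 convolution halves the feature map.
    print("stride=2:", conv_output_size(size, 3, stride=2, padding=1))
```

With stride 1 and matching padding, the output size stays at 64 for every dilation rate while the kernel span grows from 3 to 9. The stride-2 call shows the downsampling that dilation avoids.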

Acknowledgements

This work was supported by the Fundamental Research Funds for the Universities of Henan Province (NSFRF220414) and the Excellent Young Teachers Program of Henan Polytechnic University (No. 2019XQG-02).

Author information

Corresponding author

Correspondence to Zhanqiang Huo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, J., Zhao, C., Liu, Z. et al. AFC-Net: adjacent feature complementary for crowded pedestrian detection. Machine Vision and Applications 34, 85 (2023). https://doi.org/10.1007/s00138-023-01439-6

  • DOI: https://doi.org/10.1007/s00138-023-01439-6
