An improved scheme of deep dilated feature extraction on pedestrian detection

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Balancing high detection accuracy against fast inference is one of the most challenging problems in pedestrian detection algorithms based on convolutional neural networks. In this paper, we presented a one-stage pedestrian detection algorithm that optimises this trade-off through an improved scheme employing deep network features. Firstly, a novel branch was attached to the ResNet-50 backbone network. In contrast to conventional convolution, the dilated convolution used in this branch extracted much richer context features. Secondly, a classification and regression sub-network with stacked predictors was proposed to locate objects and recognise whether they are pedestrians. Finally, a novel loss function was introduced into the scheme to improve network training by learning more detailed information about pedestrian locations. The proposed scheme achieved a competitive miss rate of 12.90 on the challenging CityPersons pedestrian detection benchmark, with a favourable balance of accuracy and speed.
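
To illustrate why a dilated convolution captures wider context than a conventional one with the same number of weights, the following is a minimal one-dimensional sketch. It is not the branch described in the paper: the function names `effective_kernel_size` and `dilated_conv1d`, the toy signal, and the all-ones kernel are assumptions made purely for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D cross-correlation whose taps are spaced `dilation` apart."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Each tap i reads the input at an offset of i * dilation.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


# Hypothetical toy input: the same 3-tap kernel, with and without dilation.
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # spans 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # spans 5 inputs

print(effective_kernel_size(3, 1), dense)
print(effective_kernel_size(3, 2), dilated)
```

With dilation 2, the three weights cover five input positions instead of three, so each output sees a wider neighbourhood at no extra parameter cost. This is the general property that the abstract refers to as richer context features.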

Acknowledgements

This study was sponsored by the China Shandong Key R&D Plan (2018GGX106008) and was supported by the China Shandong Key Laboratory of Medical Physical Image Processing Technology. The authors are very grateful for the fruitful discussion with Hui Shi.

Author information

Corresponding author

Correspondence to Chengjie Bai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ma, J., Wan, H., Wang, J. et al. An improved scheme of deep dilated feature extraction on pedestrian detection. SIViP 15, 231–239 (2021). https://doi.org/10.1007/s11760-020-01742-z

  • DOI: https://doi.org/10.1007/s11760-020-01742-z
