Real-time pedestrian detection with deep supervision in the wild

  • Original Paper
Signal, Image and Video Processing

Abstract

Pedestrian detection is a challenging research task with wide application in autonomous driving and intelligent surveillance. Although many deep learning approaches detect pedestrians effectively, they struggle to achieve a good trade-off between speed and accuracy. In this paper, we propose a new pedestrian detection algorithm to address this problem and introduce a new pedestrian dataset for evaluating detection performance in our experiments. Our model consists of a region generation module and a region prediction module, which can be processed in parallel for speed. The generation module adopts a feature pyramid strategy to make full use of features and uses deconvolution layers to obtain more high-level contextual features. Deep supervision is introduced into the prediction module to guide the detection results toward the ground truth. Finally, the proposed method is evaluated on three datasets (INRIA, ETH and Caltech) and compared with existing state-of-the-art methods; the experimental results show that it achieves competitive accuracy at real-time speed.
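
To illustrate the deep supervision idea mentioned in the abstract, the following minimal Python sketch adds an auxiliary loss for each intermediate (side) output to the loss on the final output, so that every supervised layer is pushed toward the ground truth. The abstract does not specify the loss, so the binary cross-entropy, the default equal weights, and the function names `binary_cross_entropy` and `deeply_supervised_loss` are assumptions for illustration, following the general formulation of deeply-supervised nets rather than the authors' implementation.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def deeply_supervised_loss(side_outputs, final_output, labels, side_weights=None):
    """Loss on the final output plus a weighted loss on each side output.

    side_outputs: list of per-layer prediction lists (hypothetical side branches).
    side_weights: one weight per side output; defaults to 1.0 each (an assumption).
    """
    if side_weights is None:
        side_weights = [1.0] * len(side_outputs)
    if len(side_weights) != len(side_outputs):
        raise ValueError("need exactly one weight per side output")

    loss = binary_cross_entropy(final_output, labels)
    for weight, side in zip(side_weights, side_outputs):
        loss += weight * binary_cross_entropy(side, labels)
    return loss


# Two candidate regions: the first is a pedestrian (1), the second is background (0).
labels = [1, 0]
side_outputs = [[0.6, 0.4], [0.7, 0.3]]  # coarse predictions from earlier layers
final_output = [0.9, 0.1]                # prediction from the last layer
print(deeply_supervised_loss(side_outputs, final_output, labels))
```

Setting a side weight to zero removes that branch from the objective, which makes it easy to compare training with and without supervision on the intermediate layers.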

Funding

Funding was provided by the National Natural Science Foundation of China (Grant Nos. 61203261 and 61876099) and a China Postdoctoral Science Foundation-funded project (Grant No. 2012M521335).

Author information

Corresponding author

Correspondence to Zhenxue Chen.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Z., Chen, Z., Wu, Q.M.J. et al. Real-time pedestrian detection with deep supervision in the wild. SIViP 13, 761–769 (2019). https://doi.org/10.1007/s11760-018-1406-6

  • DOI: https://doi.org/10.1007/s11760-018-1406-6
