A New Framework for People Counting from Coarse to Fine Could be Robust to Viewpoint and Illumination

  • Conference paper
  • Intelligent Information and Database Systems (ACIIDS 2018)

Abstract

People counting is an important task in video surveillance. Despite significant improvements, the task still faces many challenges, such as heavy occlusion in crowded environments, viewpoint variation, and varying illumination. The people counting process consists of a people detection stage and a people tracking stage. This paper focuses on improving people counting results through better people detection. Our proposed method combines Deformable Part Models (DPM) and a Deep Convolutional Neural Network (DCNN) to exploit the advantages of each and to overcome their respective shortcomings in people detection. First, to be robust to viewpoint and occlusion, we fuse the people detection results from parts detected by DPM, such as the head, head-shoulders, upper body, and full body. Second, to overcome the inefficiency of DPM in zoomed-in views, we use a DCNN to detect the head region, because the body is often occluded, leaving only the head fully visible for counting. Finally, we apply late fusion to the detection results of the two models. We run experiments on the PETS 2012 and TUD datasets and evaluate performance by mean absolute error (MAE) and mean relative error (MRE). The experimental results show that our method achieves higher performance than the methods of Albiol [1], Conte [5], and Subburaman [22] on the PETS dataset and, in particular, outperforms a state-of-the-art method, YOLO9000 [10], with its parameters fine-tuned on the HollywoodHeads dataset. Moreover, it achieves high performance in sparse and medium-density crowd environments and is robust to scale, viewpoint, illumination, occlusion, and deformation.
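
The late-fusion and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the fusion rule, so the sketch swaps in a common stand-in: it pools the boxes from both detectors and applies greedy non-maximum suppression. It assumes that every detection has already been mapped to a head box and that the two detectors' scores are on a comparable scale. The names `late_fusion`, `iou`, `mae`, and `mre`, and the 0.5 overlap threshold, are hypothetical and not taken from the paper. MAE and MRE follow their usual definitions over per-frame counts.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2, score); assumed to be head boxes with comparable scores.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def late_fusion(dpm: Sequence[Box], dcnn: Sequence[Box],
                iou_thresh: float = 0.5) -> List[Box]:
    """Pool both detectors' boxes and keep one box per overlapping group.

    Stand-in fusion rule (greedy NMS), not the rule used in the paper.
    """
    pooled = sorted(list(dpm) + list(dcnn), key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for box in pooled:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return kept


def mae(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute error between predicted and true per-frame counts."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def mre(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean relative error; assumes every true count is non-zero."""
    return sum(abs(p - t) / t for p, t in zip(pred, truth)) / len(truth)


if __name__ == "__main__":
    # Made-up detections for one frame, purely for illustration.
    dpm_boxes = [(0, 0, 10, 10, 0.9), (50, 50, 60, 60, 0.6)]
    dcnn_boxes = [(1, 1, 11, 11, 0.8), (100, 100, 110, 110, 0.7)]
    fused = late_fusion(dpm_boxes, dcnn_boxes)
    print("people counted in frame:", len(fused))
    print("MAE:", mae([3, 5, 4], [4, 5, 2]))
    print("MRE:", mre([3, 5, 4], [4, 5, 2]))
```

In this sketch the per-frame count is simply the number of boxes that survive fusion, and MAE and MRE are then computed over the sequence of per-frame counts against the ground truth.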

Notes

  1. http://www.cvg.reading.ac.uk/PETS2012/a.html.

  2. https://motchallenge.net/data/2D_MOT_2015.

References

  1. Albiol, A., et al.: Video analysis using corner motion statistics. In: Performance Evaluation of Tracking and Surveillance Workshop at CVPR, pp. 31–37 (2009)

  2. Zeng, C., et al.: Robust head-shoulder detection by PCA-based multilevel HOG-LBP detector for people counting. In: 20th ICPR, pp. 2069–2072 (2010). https://doi.org/10.1109/icpr.2010.509

  3. Tome, D., et al.: Deep convolutional neural networks for pedestrian detection. arXiv:1510.03608 (2016). https://doi.org/10.1016/j.image.2016.05.007

  4. Kang, D., et al.: Beyond counting: comparisons of density maps for crowd analysis tasks - counting, detection, and tracking. arXiv:1705.10118 (2017)

  5. Conte, D., et al.: A method for counting people in crowded scenes. In: 7th IEEE International Conference on AVSS, pp. 225–232 (2010). https://doi.org/10.1109/avss.2010.78

  6. Ling, D., et al.: An automatic people counting method of hotel dining with occlusion. J. Artif. Intell. Pract. 1(1), 1–7 (2016)

  7. Schroff, F., et al.: FaceNet: a unified embedding for face recognition and clustering. arXiv:1503.03832 (2015). https://doi.org/10.1109/cvpr.2015.7298682

  8. Idrees, H.: Visual analysis of extremely dense crowded scenes. Ph.D. dissertation, University of Central Florida, USA (2014)

  9. Barandiaran, J., et al.: Real-time people counting using multiple lines. In: WIAMIS, pp. 159–162 (2008). https://doi.org/10.1109/wiamis.2008.27

  10. Redmon, J., et al.: YOLO9000: better, faster, stronger. arXiv:1612.08242 (2017)

  11. García, J., et al.: Directional people counter based on head tracking. IEEE TIE 60(9), 3991–4000 (2013). https://doi.org/10.1109/TIE.2012.2206330

  12. van de Sande, K.E.A., et al.: Segmentation as selective search for object recognition. In: ICCV (2011). https://doi.org/10.1109/iccv.2011.6126456

  13. Boominathan, L., et al.: CrowdNet: a deep convolutional network for dense crowd counting. arXiv:1608.06197, pp. 640–644 (2016)

  14. Bourdev, L., Maji, S., Brox, T., Malik, J.: Detecting people using mutually consistent poselet activations. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6316, pp. 168–181. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15567-3_13

  15. Pizzo, L.D., et al.: Counting people by RGB or depth overhead cameras. Pattern Recogn. Lett. 81, 41–50 (2016). https://doi.org/10.1016/j.patrec.2016.05.033

  16. Ngoc, L.Q., et al.: Event retrieval in soccer video from coarse to fine based on multi-modal approach. In: IEEE RIVF, pp. 308–313 (2010). https://doi.org/10.1109/rivf.2010.5632694

  17. Oquab, M., et al.: Learning and transferring mid-level image representations using convolutional neural networks. In: IEEE Conference on CVPR, pp. 1717–1724 (2014). https://doi.org/10.1109/cvpr.2014.222

  18. Felzenszwalb, P.F., et al.: Object detection with discriminatively trained part based models. IEEE TPAMI 32, 1627–1645 (2010). https://doi.org/10.1109/TPAMI.2009.167

  19. Felzenszwalb, P.F., et al.: Discriminatively trained deformable part models (2010). Release 4 http://people.cs.uchicago.edu/~pff/latent-release4

  20. Girshick, R., et al.: Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv:1311.2524 (2014). https://doi.org/10.1109/cvpr.2014.81

  21. Vu, T.-H., et al.: Context-aware CNNs for person head detection. In: IEEE on ICCV, pp. 2893–2901 (2015). https://doi.org/10.1109/ICCV.2015.331

  22. Subburaman, V.B., et al.: Counting people in the crowd using a generic head detector. In: 9th IEEE on AVSS, pp. 470–475 (2012). https://doi.org/10.1109/avss.2012.87

  23. Liu, W., et al.: SSD: single shot multibox detector. arXiv:1512.02325 (2016). https://doi.org/10.1007/978-3-319-46448-0_2

  24. Taigman, Y., et al.: DeepFace: closing the gap to human-level performance in face verification. In: IEEE on CVPR, pp. 1701–1708 (2014). https://doi.org/10.1109/cvpr.2014.220

  25. Zhao, Z., Li, H., Zhao, R., Wang, X.: Crossing-line crowd counting with two-phase deep neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 712–726. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_43

Author information

Corresponding authors

Correspondence to An H. Nguyen or Ngoc Q. Ly.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this paper

Nguyen, A.H., Ly, N.Q. (2018). A New Framework for People Counting from Coarse to Fine Could be Robust to Viewpoint and Illumination. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10752. Springer, Cham. https://doi.org/10.1007/978-3-319-75420-8_59

  • DOI: https://doi.org/10.1007/978-3-319-75420-8_59

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75419-2

  • Online ISBN: 978-3-319-75420-8

  • eBook Packages: Computer Science, Computer Science (R0)
