Single-image crowd counting: a comparative survey on deep learning-based approaches

  • Trends and Surveys
International Journal of Multimedia Information Retrieval

Abstract

Crowd counting is an attractive computer vision problem. Solutions to crowd counting adapt readily to other counting problems, such as traffic counting and cell counting. Numerous methods have been proposed for the problem, and deep learning-based methods play a significant role in recent advances. However, no existing literature review traces how these methods have developed in response to the challenges of the task. In this paper, we discuss and categorize recent deep learning works in crowd counting according to how they address those challenges.

Author information

Corresponding author

Correspondence to Thanh Duc Ngo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under Grant No. C2018-26-01.

About this article

Cite this article

Nguyen, V., Ngo, T.D. Single-image crowd counting: a comparative survey on deep learning-based approaches. Int J Multimed Info Retr 9, 63–80 (2020). https://doi.org/10.1007/s13735-019-00181-y

  • DOI: https://doi.org/10.1007/s13735-019-00181-y
