A crowd counting method via density map and counting residual estimation

Multimedia Tools and Applications

Abstract

Recent state-of-the-art crowd counting methods predict a density map and then aggregate it to obtain the final count. CSRNet, a density map-based network for congested scene recognition proposed in 2018, achieved better crowd counting performance than previous methods with a simple architecture. It uses the first 10 layers of VGG-16 as the front end and dilated convolutional layers as the back end to generate high-quality density maps. CSRNet was evaluated on four datasets (ShanghaiTech, UCF_CC_50, World Expo’10, and UCSD) and performed well on all of them. To improve on it, in this paper we propose a small network as a new component that estimates a counting residual, and we combine this component with CSRNet. We evaluate the combined network on three datasets (ShanghaiTech, UCF_CC_50, and World Expo’10) and compare the results with those of CSRNet. The results show that our method significantly improves on CSRNet. A series of ablation and control experiments further demonstrates the effectiveness of our method. In future work, we will apply our method to other networks to achieve better results.
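
As a rough illustration of the idea in the abstract, the sketch below sums a toy density map to obtain a count and then adds a counting residual. The additive combination, the function names (`count_from_density`, `refined_count`), and the toy values are assumptions made for illustration only. The abstract does not specify how the residual is combined with the density map count, and in the paper the density map and the residual would come from CSRNet and the proposed small network rather than being hard-coded.

```python
from typing import List


def count_from_density(density_map: List[List[float]]) -> float:
    """Aggregate a density map into a count by summing all of its cells."""
    return sum(sum(row) for row in density_map)


def refined_count(density_map: List[List[float]], residual: float) -> float:
    """Add a counting residual to the density map count.

    Assumption: the residual is a scalar correction that is simply added
    to the aggregated count. This is a hypothetical reading of the abstract,
    not the confirmed formulation of the paper.
    """
    return count_from_density(density_map) + residual


# Toy 3x3 density map (hypothetical values, not taken from any dataset).
toy_density = [
    [0.0, 0.25, 0.5],
    [0.25, 1.0, 0.5],
    [0.0, 0.5, 0.0],
]

# Hypothetical residual, standing in for the output of the small network.
toy_residual = 0.5

base = count_from_density(toy_density)
final = refined_count(toy_density, toy_residual)
print(f"density map count: {base}, count with residual: {final}")
```

Keeping the residual as a separate scalar term means the density map branch can stay unchanged while the correction is learned by a second, smaller component, which matches the abstract's description of a component attached to CSRNet.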

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61971073).

Author information

Corresponding author

Correspondence to Jun Sang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, L., Guo, Y., Sang, J. et al. A crowd counting method via density map and counting residual estimation. Multimed Tools Appl 81, 43503–43512 (2022). https://doi.org/10.1007/s11042-022-13220-4

  • DOI: https://doi.org/10.1007/s11042-022-13220-4
