A crowd counting method via density map and counting residual estimation

Yang, Li; Guo, Yanqun; Sang, Jun; Wu, Weiqun; Wu, Zhongyuan; Liu, Qi; Xia, Xiaofeng

doi:10.1007/s11042-022-13220-4

Published: 24 May 2022

Multimedia Tools and Applications, Volume 81, pages 43503–43512 (2022)

Li Yang^1,2, Yanqun Guo^3,4, Jun Sang^1,2 (ORCID: orcid.org/0000-0002-8703-7310), Weiqun Wu^1,2, Zhongyuan Wu^1,2, Qi Liu^1,2 & Xiaofeng Xia^1,2

Abstract

Recent state-of-the-art crowd counting methods focus on predicting a density map and then aggregating it into a final count. CSRNet, a density map-based network for congested scene recognition proposed in 2018, achieved better crowd counting performance than previous methods with a simple architecture. It uses the first 10 layers of VGG-16 as the front end and dilated convolutional layers as the back end to generate high-quality density maps. CSRNet was evaluated on four datasets (ShanghaiTech, UCF_CC_50, World Expo’10, and UCSD) and performed well on all of them. To improve on this, we propose a small network as a new component that estimates a counting residual, and we combine this component with CSRNet. We evaluate the combined network on three datasets (ShanghaiTech, UCF_CC_50, and World Expo’10) and compare the results with those of CSRNet. The results show that our method significantly improves on CSRNet. Ablation and control experiments further demonstrate the effectiveness of our method. In future work, we will apply our method to other networks to achieve better results.
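
As a rough illustration of the idea in the abstract, the sketch below aggregates a density map into a count by summing its values and then applies a counting residual as a correction. The abstract does not state how the residual is combined with the density-map count, so the additive form used here is an assumption, and the names `count_from_density`, `corrected_count`, and the toy values are hypothetical and not taken from the paper.

```python
from typing import List

DensityMap = List[List[float]]


def count_from_density(density: DensityMap) -> float:
    """Aggregate a density map into a count by summing every pixel value."""
    return sum(sum(row) for row in density)


def corrected_count(density: DensityMap, residual: float) -> float:
    """Assumed combination: density-map count plus a scalar counting residual."""
    return count_from_density(density) + residual


# Toy 3x3 density map with hypothetical values (a real map would come
# from a density estimation network such as CSRNet).
density = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [0.0, 0.25, 0.25],
]

# Hypothetical output of the small residual-estimation network.
residual = 0.5

base = count_from_density(density)
final = corrected_count(density, residual)
print(f"density-map count: {base}, corrected count: {final}")
```

In this reading, the residual branch only has to learn the counting error that remains after the density map has been summed, which is one plausible reason a small network could suffice for it.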

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61971073).

Author information

Authors and Affiliations

Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing University, Chongqing, 400044, China
Li Yang, Jun Sang, Weiqun Wu, Zhongyuan Wu, Qi Liu & Xiaofeng Xia
School of Big Data & Software Engineering, Chongqing University, Chongqing, 401331, China
Li Yang, Jun Sang, Weiqun Wu, Zhongyuan Wu, Qi Liu & Xiaofeng Xia
School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, China
Yanqun Guo
Southwest Institute of Electronic Equipment, Chengdu, 610036, China
Yanqun Guo

Corresponding author

Correspondence to Jun Sang.

About this article

Cite this article

Yang, L., Guo, Y., Sang, J. et al. A crowd counting method via density map and counting residual estimation. Multimed Tools Appl 81, 43503–43512 (2022). https://doi.org/10.1007/s11042-022-13220-4

Received: 18 May 2020
Revised: 21 December 2020
Accepted: 11 May 2022
Published: 24 May 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11042-022-13220-4
