Pyramid-dilated deep convolutional neural network for crowd counting

Wang, Weixing; Liu, Quanli; Wang, Wei

doi:10.1007/s10489-021-02537-6

Published: 29 May 2021

Applied Intelligence, Volume 52, pages 1825–1837 (2022)

Weixing Wang^1,2,
Quanli Liu^1,3 &
Wei Wang^1,3

Abstract

Statistics on crowds in crowded scenes can reflect the density level of crowds and provide safety warnings. This is a laborious task if conducted manually. In recent years, automated crowd counting has received extensive attention in the computer vision field. However, this task is still challenging mainly due to the serious occlusion in crowds and large appearance variations caused by the viewing angles of cameras. To overcome these difficulties, a pyramid-dilated deep convolutional neural network for accurate crowd counting called PDD-CNN is proposed. PDD-CNN is based on a VGG-16 network that is designed to generate dense attribute feature maps from an image with an arbitrary size or resolution. Then, two pyramid dilated modules are adopted, each consisting of four parallel dilated convolutional layers with different rates and a parallel average pooling layer to capture the multiscale features. Finally, three cascading dilated convolutions are used to regress the density map and perform accurate count estimation. In addition, a novel training loss, combining the Euclidean loss with the structural similarity loss, is employed to attenuate the blurry effects of density map estimation. The experimental results on three datasets (ShanghaiTech, UCF_CC_50, and UCF-QNRF) demonstrate that the proposed PDD-CNN produces high-quality density maps and achieves a good counting performance.
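
The ideas in the abstract can be illustrated with a minimal, dependency-free sketch. It shows three things: how the dilation rate widens the extent of a convolution kernel, how a count is read off a density map by summing it, and how a Euclidean loss might be combined with a structural similarity (SSIM) term. The abstract does not give the dilation rates, the SSIM window, the stabilising constants, or the weighting between the two loss terms, so the values used here (rates 1, 2, 4, 8, a single global SSIM window, `c1`, `c2`, and `alpha`) are assumptions for illustration only, not the authors' settings.

```python
from math import fsum


def effective_kernel(k, rate):
    """Extent covered by a k-wide kernel with dilation `rate`."""
    return k + (k - 1) * (rate - 1)


def count_from_density(density):
    """The estimated count is the sum (integral) of the density map."""
    return fsum(fsum(row) for row in density)


def euclidean_loss(pred, target):
    """Half the squared pixel-wise L2 distance for one image."""
    return 0.5 * fsum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def global_ssim(pred, target, c1=1e-4, c2=9e-4):
    """SSIM with the whole map treated as one window.

    Simplification: SSIM is normally averaged over local windows.
    c1 and c2 are assumed stabilising constants.
    """
    x = [v for row in pred for v in row]
    y = [v for row in target for v in row]
    n = len(x)
    mx, my = fsum(x) / n, fsum(y) / n
    vx = fsum((a - mx) * (a - mx) for a in x) / n
    vy = fsum((b - my) * (b - my) for b in y) / n
    cov = fsum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den


def combined_loss(pred, target, alpha=0.001):
    """Euclidean loss plus a weighted (1 - SSIM) term; alpha is assumed."""
    return euclidean_loss(pred, target) + alpha * (1.0 - global_ssim(pred, target))


if __name__ == "__main__":
    # Hypothetical rates for four parallel 3x3 dilated branches.
    for rate in (1, 2, 4, 8):
        print(f"rate {rate}: effective kernel {effective_kernel(3, rate)}")

    target = [[0.0, 0.5], [0.5, 1.0]]   # toy ground-truth density map
    pred = [[0.1, 0.4], [0.6, 0.9]]     # toy predicted density map
    print("ground-truth count:", count_from_density(target))
    print("combined loss:", combined_loss(pred, target))
```

The SSIM term equals 1 when the two maps are identical, so `1 - SSIM` adds a penalty only when the local structure of the prediction departs from the ground truth, which is the property the abstract relies on to reduce blur in the estimated density maps.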

Acknowledgements

This research was supported by the National Natural Science Foundation of China (61773085).

Author information

Authors and Affiliations

School of Control Science and Engineering, Dalian University of Technology, Dalian, 116024, China
Weixing Wang, Quanli Liu & Wei Wang
School of Mechanical Engineering, Dalian University, Dalian, 116622, China
Weixing Wang
Key Laboratory of Intelligent Control and Optimization for Industrial Equipment (Dalian University of Technology), Ministry of Education, Dalian, 116024, China
Quanli Liu & Wei Wang

Corresponding author

Correspondence to Quanli Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, W., Liu, Q. & Wang, W. Pyramid-dilated deep convolutional neural network for crowd counting. Appl Intell 52, 1825–1837 (2022). https://doi.org/10.1007/s10489-021-02537-6

Accepted: 17 May 2021
Published: 29 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02537-6
