Average up-sample network for crowd counting

Wu, Di; Fan, Zheyi; Cui, Mengjie

doi:10.1007/s10489-021-02470-8

Published: 20 May 2021

Volume 52, pages 1376–1388, (2022)

Applied Intelligence

Abstract

Crowd counting has received increasing attention in recent years, but it still faces many challenges, such as extremely dense scenes, scale variation, and background clutter. The quality of the generated density map plays an important role in counting performance. In this paper, we propose an encoder-decoder network called Average Up-sample Convolution Neural Network (AU-CNN) for high-quality density map generation and accurate count estimation. The encoder extracts features from the input image, while the decoder gradually restores the feature map to the original size of the input image using a simple but effective average up-sample module. The average up-sample module takes the average of the interpolation results from three different up-sample methods, without adding any extra parameters. Moreover, whereas most existing counting algorithms use only the Euclidean loss, we optimize the network with a combined loss function of Euclidean loss and count loss, which is shown to improve performance. Experiments on the ShanghaiTech, UCF_CC_50, and UCF_QNRF datasets demonstrate the strong counting performance and robustness of the proposed method.
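The two ideas in the abstract, averaging several parameter-free interpolations and combining a pixel-wise loss with a count loss, can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption for illustration only: the abstract does not name the three up-sample methods (nearest, linear and cubic kernels are used below), does not give the exact form of either loss term, and does not state a weighting factor (`lam` is hypothetical). This is not the authors' implementation.

```python
import math

# Three interpolation kernels. ASSUMPTION: the abstract does not say which
# three up-sample methods AU-CNN averages; these are common choices.
def box(t):                     # nearest neighbour
    return 1.0 if -0.5 <= t < 0.5 else 0.0

def triangle(t):                # linear (bilinear when applied on both axes)
    return max(0.0, 1.0 - abs(t))

def keys_cubic(t, a=-0.5):      # cubic convolution (bicubic on both axes)
    t = abs(t)
    if t <= 1.0:
        return (a + 2.0) * t**3 - (a + 3.0) * t**2 + 1.0
    if t < 2.0:
        return a * t**3 - 5.0 * a * t**2 + 8.0 * a * t - 4.0 * a
    return 0.0

def resize_1d(values, scale, kernel):
    """Up-sample a 1-D list by an integer factor, using half-pixel centres."""
    n = len(values)
    out = []
    for i in range(n * scale):
        src = (i + 0.5) / scale - 0.5           # position in source coordinates
        base = math.floor(src)
        total, weight_sum = 0.0, 0.0
        for j in range(base - 1, base + 3):     # 4 taps cover all three kernels
            w = kernel(src - j)
            total += w * values[min(max(j, 0), n - 1)]  # clamp at the borders
            weight_sum += w
        out.append(total / weight_sum)
    return out

def resize_2d(img, scale, kernel):
    """Separable resize: interpolate along rows, then along columns."""
    rows = [resize_1d(r, scale, kernel) for r in img]
    cols = [resize_1d(list(c), scale, kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]        # transpose back to row-major

def average_upsample(img, scale=2):
    """Element-wise mean of the three interpolations; no learnable parameters."""
    results = [resize_2d(img, scale, k) for k in (box, triangle, keys_cubic)]
    return [[sum(vals) / len(results) for vals in zip(*rows)]
            for rows in zip(*results)]

def combined_loss(pred, gt, lam=1.0):
    """ASSUMED form: squared pixel error plus lam * absolute count error."""
    euclid = sum((p - g) ** 2
                 for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    count = abs(sum(map(sum, pred)) - sum(map(sum, gt)))
    return euclid + lam * count
```

For example, `average_upsample(feature_map, 2)` doubles the height and width of a nested-list feature map, and `combined_loss(pred_density, gt_density, lam=0.5)` adds half of the absolute difference between the two maps' sums to their pixel-wise squared error. In a real network the same averaging would be applied per channel to tensors rather than to nested lists.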

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 61701029, the National Natural Science Foundation of Beijing under Grant L192036, and the Innovation Fund for Industry, Education and Research, Science and Technology Development Center, Ministry of Education, under Grant 201920548040.

Author information

Authors and Affiliations

School of Information and Electronics, Beijing Institute of Technology, Beijing, China
Di Wu, Zheyi Fan & Mengjie Cui

Corresponding author

Correspondence to Zheyi Fan.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, D., Fan, Z. & Cui, M. Average up-sample network for crowd counting. Appl Intell 52, 1376–1388 (2022). https://doi.org/10.1007/s10489-021-02470-8

Accepted: 21 April 2021
Published: 20 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02470-8
