Average up-sample network for crowd counting

Applied Intelligence

Abstract

Crowd counting has received increasing attention in recent years, but it still faces many challenges, such as extremely dense scenes, scale variation, and background clutter. The quality of the generated density map plays an important role in counting performance. In this paper, we propose an encoder-decoder network called Average Up-sample Convolution Neural Network (AU-CNN) for high-quality density map generation and accurate count estimation. The encoder extracts features from the input image, while the decoder gradually restores the feature map to the original size of the input image using a simple but effective average up-sample module. The average up-sample module averages the interpolation results of three different up-sample methods without adding any extra parameters. Moreover, whereas most existing counting algorithms use only the Euclidean loss, we optimize the network with a combined loss function of Euclidean loss and count loss, which is shown to improve performance. Experiments on the ShanghaiTech, UCF_CC_50, and UCF_QNRF datasets demonstrate the strong counting performance and robustness of the proposed method.
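The two ideas in the abstract, a parameter-free average of three interpolation results and a loss that adds a count term to the Euclidean term, can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the three up-sample methods, so nearest, linear, and cubic (Catmull-Rom) interpolation are assumed here, and the function names, the absolute-difference form of the count loss, and the `count_weight` value are all hypothetical.

```python
import math

# Assumption: nearest, linear and cubic interpolation stand in for the
# three up-sample methods, which the abstract does not specify.

def _coords(n, scale):
    """Source coordinate of every output sample (pixel-centre alignment)."""
    return [(i + 0.5) / scale - 0.5 for i in range(n * scale)]

def _clamp(i, n):
    """Clamp an index to the valid range [0, n - 1] (edge replication)."""
    return max(0, min(n - 1, i))

def up_nearest(row, scale):
    return [row[i // scale] for i in range(len(row) * scale)]

def up_linear(row, scale):
    n, out = len(row), []
    for x in _coords(n, scale):
        i0 = math.floor(x)
        t = x - i0
        out.append((1 - t) * row[_clamp(i0, n)] + t * row[_clamp(i0 + 1, n)])
    return out

def up_cubic(row, scale):
    n, out = len(row), []
    for x in _coords(n, scale):
        i0 = math.floor(x)
        t = x - i0
        p0, p1, p2, p3 = (row[_clamp(i0 + k, n)] for k in (-1, 0, 1, 2))
        # Catmull-Rom spline evaluated in Horner form.
        out.append(p1 + 0.5 * t * ((p2 - p0) + t * (
            (2 * p0 - 5 * p1 + 4 * p2 - p3) + t * (3 * (p1 - p2) + p3 - p0))))
    return out

def upsample2d(fmap, scale, up1d):
    """Apply a 1-D up-sampler along rows, then along columns."""
    rows = [up1d(list(r), scale) for r in fmap]
    cols = [up1d(list(c), scale) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def average_upsample(fmap, scale=2, methods=(up_nearest, up_linear, up_cubic)):
    """Element-wise mean of several interpolation results; no learned parameters."""
    results = [upsample2d(fmap, scale, m) for m in methods]
    h, w = len(results[0]), len(results[0][0])
    return [[sum(r[y][x] for r in results) / len(results) for x in range(w)]
            for y in range(h)]

def combined_loss(pred, target, count_weight=0.001):
    """Euclidean (sum of squared errors) loss plus a weighted count loss.

    Assumption: the count loss is the absolute difference between the sums
    of the two density maps, and count_weight is a hypothetical value.
    """
    euclidean = sum((p - t) ** 2
                    for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    count = abs(sum(map(sum, pred)) - sum(map(sum, target)))
    return euclidean + count_weight * count
```

Because every interpolation method here reproduces a constant input exactly, their average does too, which is a quick sanity check when swapping in other methods. In a deep learning framework the same averaging would normally be done with the library's built-in interpolation modes rather than hand-written loops.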

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 61701029, the National Natural Science Foundation of Beijing under Grant L192036, and the Innovation Fund for Industry, Education and Research, Science and Technology Development Center, Ministry of Education, under Grant 201920548040.

Author information

Corresponding author

Correspondence to Zheyi Fan.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, D., Fan, Z. & Cui, M. Average up-sample network for crowd counting. Appl Intell 52, 1376–1388 (2022). https://doi.org/10.1007/s10489-021-02470-8

  • DOI: https://doi.org/10.1007/s10489-021-02470-8
