
GTFNet: Ground Truth Fitting Network for Crowd Counting

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12396)

Abstract

Crowd counting aims to estimate the number of pedestrians in a single image. Current crowd counting methods usually obtain counting results by integrating density maps. However, the label density map generated by a Gaussian kernel cannot accurately map the ground truth in the corresponding crowd image, which affects the final counting result. In this paper, a ground truth fitting network called GTFNet is proposed, which aims to generate estimated density maps that fit the ground truth better. Firstly, a VGG network combined with dilated convolutional layers is used as the backbone of GTFNet to extract hierarchical features. The multi-level features are concatenated to compensate for the information loss caused by pooling operations, which helps the network capture both texture and spatial information. Secondly, a regional consistency loss function is designed to compare the estimated density map and the label density map at different region levels. During training, region-level dynamic weights assign a suitable region fitting range to the network, which effectively reduces the impact of label errors on the estimated density maps. Finally, the proposed GTFNet is evaluated on three crowd counting datasets (ShanghaiTech, UCF_CC_50 and UCF-QNRF). The experimental results demonstrate that GTFNet achieves excellent overall performance on all of these datasets.
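The two ideas in the abstract can be illustrated concretely. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of (a) the standard Gaussian-kernel label density map used throughout the crowd counting literature, and (b) one plausible reading of a region-level consistency loss, where the predicted and label density maps are compared over k x k sub-regions at several granularities with per-level weights. The fixed `sigma`, the choice of levels `(1, 2, 4)`, and the squared-error form are assumptions for illustration only.

```python
import numpy as np

def gaussian_density_map(points, shape, sigma=4.0):
    """Place one normalized 2-D Gaussian per annotated head position.

    Each Gaussian is renormalized to sum to 1, so integrating (summing)
    the resulting map recovers the person count exactly.
    """
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.mgrid[0:h, 0:w]          # coordinate grid, computed once
    for (px, py) in points:               # (x, y) head annotations
        g = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
        g /= g.sum()                       # each person contributes exactly 1
        density += g
    return density

def region_consistency_loss(pred, label, levels=(1, 2, 4), weights=None):
    """Squared count error summed over k x k blocks at several region levels.

    Level 1 compares global counts; finer levels enforce agreement on
    smaller regions. `weights` stands in for the region-level dynamic
    weights described in the abstract (here just fixed scalars).
    """
    if weights is None:
        weights = [1.0] * len(levels)
    h, w = label.shape
    loss = 0.0
    for k, wgt in zip(levels, weights):
        bh, bw = h // k, w // k
        for i in range(k):
            for j in range(k):
                p = pred[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].sum()
                t = label[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].sum()
                loss += wgt * (p - t) ** 2
    return loss
```

For example, a 64 x 64 map built from two head annotations sums to 2.0, and the region loss between a map and itself is zero; in GTFNet the per-level weights are adjusted dynamically during training rather than fixed as above.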



Acknowledgements

This work was supported by National Natural Science Foundation of China (No. 61971073).

Author information


Correspondence to Jun Sang.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tan, J., Sang, J., Xiang, Z., Shi, Y., Xia, X. (2020). GTFNet: Ground Truth Fitting Network for Crowd Counting. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol 12396. Springer, Cham. https://doi.org/10.1007/978-3-030-61609-0_19

  • DOI: https://doi.org/10.1007/978-3-030-61609-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61608-3

  • Online ISBN: 978-3-030-61609-0

  • eBook Packages: Computer Science
