An Improved Crowd Counting Method Based on YOLOv3

Zheng, Shuang; Wu, Junfeng; Liu, Fugang; Liang, Yunhao; Zhao, Lingfei

doi:10.1007/978-3-031-04409-0_31

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 438)

Included in the following conference series:

International Conference on Machine Learning and Intelligent Communications

Abstract

This paper proposes a crowd counting method. We use ResNeSt-50 as the backbone network of YOLOv3. After the backbone, we add SPP (Spatial Pyramid Pooling) and PANet (Path Aggregation Network) to enlarge the receptive field of the convolutional neural network and to improve the accuracy of pedestrian-flow and crowd counting in real application scenarios. For high-density crowd counting, a deep network built on an improved VGG network captures high-level semantic information, while a shallow network detects the head blobs of people far from the camera. The deep and shallow networks are combined to handle high-density crowds. Finally, fusing the two network models above further improves the accuracy and applicability of the algorithm. The method improves detection accuracy when there are few people or when people are occluded, and effectively reduces the estimation error in scenes with high-density crowds.
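The role of the SPP stage mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is not the chapter's implementation: the abstract gives no configuration, so the kernel sizes 5, 9 and 13 are an assumption borrowed from common YOLO-style SPP variants, and the names `max_pool_same` and `spp_block` are hypothetical. The idea shown is that the same feature map is max-pooled with several window sizes at stride 1, and the results are concatenated with the input along the channel axis, so each location sees context at several scales without changing the spatial size.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one feature-map channel, H x W


def max_pool_same(channel: Grid, kernel: int) -> Grid:
    """Stride-1 max pooling that preserves H x W (hypothetical helper)."""
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd so the window is centered")
    height, width = len(channel), len(channel[0])
    radius = kernel // 2
    pooled: Grid = []
    for row in range(height):
        out_row = []
        for col in range(width):
            # Clip the window at the borders; for max pooling this is
            # equivalent to padding the input with -infinity.
            window = [
                channel[r][c]
                for r in range(max(0, row - radius), min(height, row + radius + 1))
                for c in range(max(0, col - radius), min(width, col + radius + 1))
            ]
            out_row.append(max(window))
        pooled.append(out_row)
    return pooled


def spp_block(channels: List[Grid], kernels: Sequence[int] = (5, 9, 13)) -> List[Grid]:
    """Concatenate the input channels with one pooled copy per kernel size.

    The kernel sizes are an assumption, not taken from the chapter.
    """
    output = list(channels)  # identity branch comes first
    for kernel in kernels:
        output.extend(max_pool_same(ch, kernel) for ch in channels)
    return output


if __name__ == "__main__":
    feature = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    stacked = spp_block([feature])
    print(len(stacked), "channels of size", len(stacked[0]), "x", len(stacked[0][0]))
```

With C input channels and three kernel sizes, the block returns 4C channels of unchanged height and width. In a real detector this would normally be followed by a 1x1 convolution to reduce the channel count before the PANet-style feature aggregation.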

Acknowledgements

This work was partially supported by the Heilongjiang Science Foundation Project (LH2021F052).

Author information

Authors and Affiliations

Heilongjiang University of Science and Technology, Harbin, 150022, China
Shuang Zheng, Junfeng Wu, Fugang Liu, Yunhao Liang & Lingfei Zhao

Corresponding author

Correspondence to Fugang Liu.

Editor information

Editors and Affiliations

Jinhua Advanced Research Institute, Jinhua, China
Xiaolin Jiang

About this paper

Cite this paper

Zheng, S., Wu, J., Liu, F., Liang, Y., Zhao, L. (2022). An Improved Crowd Counting Method Based on YOLOv3. In: Jiang, X. (eds) Machine Learning and Intelligent Communications. MLICOM 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 438. Springer, Cham. https://doi.org/10.1007/978-3-031-04409-0_31

DOI: https://doi.org/10.1007/978-3-031-04409-0_31
Published: 18 May 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04408-3
Online ISBN: 978-3-031-04409-0
eBook Packages: Computer Science, Computer Science (R0)
