Abstract
Crowd counting has important applications in many fields, but it remains a challenging task because of background occlusion, scale variation, and the uneven distribution of crowds. This paper proposes the Multi-level Progressive Aggregation Network (MPANet), which enhances the channel and spatial dependencies of feature maps and effectively integrates multi-level features. The Aggregation Refinement (AR) module integrates low-level spatial information with high-level semantic information, exploiting the complementary properties of multi-level features to generate high-quality density maps. The Multi-scale Aware (MA) module captures rich contextual information through convolutional kernels of different sizes. The Semantic Attention (SA) module enhances the spatial and channel responses of feature maps, which reduces false recognition of background regions. Extensive experiments on four challenging datasets demonstrate that our approach outperforms most state-of-the-art methods.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61976127).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Meng, C., Han, R., Pang, C., Kang, C., Lyu, C., Lyu, L. (2021). MPANet: Multi-level Progressive Aggregation Network for Crowd Counting. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13110. Springer, Cham. https://doi.org/10.1007/978-3-030-92238-2_37
DOI: https://doi.org/10.1007/978-3-030-92238-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92237-5
Online ISBN: 978-3-030-92238-2
eBook Packages: Computer Science, Computer Science (R0)