Abstract
Crowd counting has received increasing attention in video surveillance and urban security systems. However, many previous models generalize poorly to unseen samples when only limited labeled samples are available. To mitigate this weakness, we develop a novel Pseudo-label Growth Dictionary Pair Learning (PG-DPL) method for crowd counting. Specifically, we treat crowd counting as a classification task and address it with a dictionary learning-based (DL) strategy. Because a shortage of diverse training samples and an imbalanced distribution across classes in crowd scenes inevitably cause large prediction deviations in the DL model, we propose pseudo-label growth (PG) and adaptive dictionary size (ADS) to improve counting accuracy with limited labeled samples. In the proposed method, PG refines the initial prediction by reconstructing the discriminant term, which improves the robustness of the learned dictionary, while ADS exploits the imbalanced distribution among classes to adapt the size of each class-specific dictionary. Extensive validation experiments on five benchmark databases indicate that the proposed PG-DPL achieves compelling performance compared with other state-of-the-art methods.
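As a rough illustration of two ideas named in the abstract, the sketch below shows (a) one possible way to size class-specific dictionaries from an imbalanced label distribution and (b) the usual reconstruction-residual rule used to classify with per-class dictionary pairs. It is not the authors' implementation: the proportional allocation rule, the function names, and the toy dictionaries are all assumptions made for illustration, and the pseudo-label growth step is not modeled.

```python
import numpy as np

def adaptive_dictionary_sizes(labels, total_atoms, min_atoms=1):
    """Assumed rule: give each class a share of atoms proportional to its sample count."""
    classes, counts = np.unique(labels, return_counts=True)
    shares = np.round(total_atoms * counts / counts.sum()).astype(int)
    sizes = np.maximum(min_atoms, shares)  # every class keeps at least min_atoms
    return dict(zip(classes.tolist(), sizes.tolist()))

def classify_by_reconstruction(x, synthesis, analysis):
    """Pick the class k minimizing ||x - D_k (P_k x)||_2.

    synthesis[k] is a (d, m_k) synthesis dictionary D_k and
    analysis[k] is a (m_k, d) analysis dictionary P_k.
    """
    residuals = {
        k: np.linalg.norm(x - synthesis[k] @ (analysis[k] @ x))
        for k in synthesis
    }
    return min(residuals, key=residuals.get)

# Toy example: two classes living in orthogonal 2-D subspaces of R^4.
# With orthonormal D_k and P_k = D_k^T, D_k P_k is a projection onto the subspace.
eye = np.eye(4)
synthesis = {0: eye[:, :2], 1: eye[:, 2:]}
analysis = {k: D.T for k, D in synthesis.items()}

sample = np.array([0.0, 0.0, 1.0, 2.0])
predicted_class = classify_by_reconstruction(sample, synthesis, analysis)

# Imbalanced labels: class 0 has six samples, classes 1 and 2 have two each.
sizes = adaptive_dictionary_sizes([0, 0, 0, 0, 0, 0, 1, 1, 2, 2], total_atoms=10)
```

In a counting setting, each class would correspond to a count (or count range), so the predicted class index maps directly to an estimated crowd size. The proportional sizing here is only one simple choice; other allocation rules could serve the same purpose.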










Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 61971339, Grant 61471161, and Grant 61972136, in part by the Key Project of the Natural Science Foundation of Shaanxi Province under Grant 2018JZ6002, in part by the Doctoral Startup Foundation of Xi’an Polytechnic University under Grant BS1616 and Grant BS1408, in part by the Shaanxi Innovation Ability Support Program (2021TD-29), Textile Intelligent Equipment Information and Control Innovation Team, and in part by the Shaanxi Innovation Team of Universities, Textile Intelligent Equipment Information and Control Innovation Team.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Liu, W., Wang, H., Luo, H. et al. Pseudo-label growth dictionary pair learning for crowd counting. Appl Intell 51, 8913–8927 (2021). https://doi.org/10.1007/s10489-021-02274-w
DOI: https://doi.org/10.1007/s10489-021-02274-w