Pseudo-label growth dictionary pair learning for crowd counting

Applied Intelligence

Abstract

Crowd counting has received increasing attention in video surveillance and urban security systems. However, many previous models generalize poorly to unseen samples when only limited labeled samples are available. To mitigate this weakness, we develop a novel Pseudo-label Growth Dictionary Pair Learning (PG-DPL) method for crowd counting. Specifically, we treat crowd counting as a classification task and address it with a dictionary learning (DL) strategy. Because a shortage of diverse training samples and an imbalanced distribution across classes in crowd scenes inevitably lead to large prediction deviations in the DL model, we propose pseudo-label growth (PG) and adaptive dictionary size (ADS) to improve counting accuracy with limited labeled samples. In the proposed method, PG refines the initial prediction by reconstructing the discriminant term, which improves the robustness of the learned dictionary, while ADS exploits the imbalanced distribution among classes to adapt the size of each class-specific dictionary. Extensive validation experiments on five benchmark databases indicate that the proposed PG-DPL achieves compelling performance compared with other state-of-the-art methods.
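To make the classification view in the abstract more concrete, the following is a minimal sketch and not the PG-DPL formulation itself. It assumes each count class k has a synthesis dictionary D_k and an analysis dictionary P_k, and that a sample is assigned to the class with the smallest reconstruction residual ||x - D_k P_k x||, as is common in dictionary pair learning. The function names, the proportional rule for sizing each class dictionary, and the SVD-based stand-in for dictionary learning are all illustrative assumptions; the pseudo-label growth step is not modeled here.

```python
import numpy as np

def allocate_atoms(labels, total_atoms, min_atoms=1):
    """Hypothetical sizing rule: atoms per class proportional to class frequency."""
    classes, counts = np.unique(labels, return_counts=True)
    share = total_atoms * counts / counts.sum()
    sizes = np.maximum(min_atoms, np.round(share).astype(int))
    return dict(zip(classes.tolist(), sizes.tolist()))

def fit_pairs(X, labels, sizes):
    """Stand-in for dictionary learning (an assumption, not the paper's solver).

    X holds one feature vector per column. For class k, D_k is formed from the
    leading left singular vectors of that class's samples and P_k = D_k^T.
    """
    pairs = {}
    for k, m in sizes.items():
        Xk = X[:, labels == k]
        U, _, _ = np.linalg.svd(Xk, full_matrices=False)
        D = U[:, :m]          # synthesis dictionary with m atoms
        pairs[k] = (D, D.T)   # analysis dictionary is its transpose here
    return pairs

def predict(x, pairs):
    """Assign x to the class whose dictionary pair reconstructs it best."""
    residuals = {k: np.linalg.norm(x - D @ (P @ x)) for k, (D, P) in pairs.items()}
    return min(residuals, key=residuals.get)
```

In this sketch a class with more training samples receives more atoms, which is one simple way to reflect an imbalanced class distribution. The actual ADS criterion is described in the full article and may differ.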

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61971339, Grant 61471161, and Grant 61972136, in part by the Key Project of the Natural Science Foundation of Shaanxi Province under Grant 2018JZ6002, in part by the Doctoral Startup Foundation of Xi’an Polytechnic University under Grant BS1616 and Grant BS1408, in part by the Shaanxi Innovation Ability Support Program (2021TD-29) – Textile Intelligent Equipment Information and Control Innovation Team, and in part by the Shaanxi Innovation Team of Universities – Textile Intelligent Equipment Information and Control Innovation Team.

Author information

Corresponding author

Correspondence to Kaibing Zhang.

About this article

Cite this article

Liu, W., Wang, H., Luo, H. et al. Pseudo-label growth dictionary pair learning for crowd counting. Appl Intell 51, 8913–8927 (2021). https://doi.org/10.1007/s10489-021-02274-w

  • DOI: https://doi.org/10.1007/s10489-021-02274-w
