A Novel cascaded deep architecture with weak-supervision for video crowd counting and density estimation

  • Application of soft computing
  • Soft Computing

Abstract

Video-based crowd counting is an essential surveillance tool that plays a crucial role in mitigating crowd catastrophes by facilitating the development and implementation of efficient crowd management methods. Deep learning approaches based on density-map regression account for the local crowd distribution but are prone to errors in the point-level annotation of human heads. Weakly supervised approaches overcome this issue by mapping global crowd attributes onto ground-truth counts. Moreover, video-based density-map regression approaches do not handle human shape variation and background effects. Hence, this research proposes a cascade of two deep structures: a local density map regressor and a global crowd count regressor trained with weakly supervised learning. The former deals with human shape variation, minimises background effects, considers the local crowd distribution, and produces crowd density maps. The latter adopts a weakly supervised learning mechanism and provides scene-level crowd counts by considering global attributes of the density maps. Experiments were conducted on three datasets, namely Venice, Mall, and UCSD, yielding promising and improved results. The code is available at https://github.com/santosh1448/LDR_GCCR_Weakly_Supervised.
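
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the Gaussian density map stands in for the output of a local density map regressor, a single global attribute (total map mass) stands in for the global features, and a least-squares line stands in for the deep count regressor. All function names, the kernel parameters `sigma` and `radius`, and the toy scenes are assumptions made for illustration only.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a (2*radius+1) square Gaussian kernel normalised to sum to 1."""
    size = 2 * radius + 1
    k = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def density_map(points, height, width, sigma=2.0, radius=4):
    """Build a density map with one unit of mass per annotated head (row, col)."""
    kernel = gaussian_kernel(sigma, radius)
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        if not (0 <= r < height and 0 <= c < width):
            raise ValueError("head annotation lies outside the frame")
        # Keep only in-bounds kernel cells, then renormalise so that a head
        # near the border still contributes a total mass of 1.
        cells = [(r + dy, c + dx, kernel[dy + radius][dx + radius])
                 for dy in range(-radius, radius + 1)
                 for dx in range(-radius, radius + 1)
                 if 0 <= r + dy < height and 0 <= c + dx < width]
        mass = sum(v for _, _, v in cells)
        for y, x, v in cells:
            dmap[y][x] += v / mass
    return dmap


def global_attribute(dmap):
    """Assumed global attribute: the total mass of the density map."""
    return sum(map(sum, dmap))


def fit_count_regressor(attributes, counts):
    """Least-squares fit of count = a * attribute + b.

    Only scene-level counts are used as labels, which mirrors the weakly
    supervised setting. Requires at least two distinct attribute values.
    """
    n = len(attributes)
    mean_x = sum(attributes) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in attributes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(attributes, counts))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Toy scenes: lists of head positions in a 16 x 16 frame (hypothetical data).
scenes = [[(5, 5)], [(3, 3), (10, 12)], [(0, 0), (8, 8), (15, 15)]]
attrs = [global_attribute(density_map(p, 16, 16)) for p in scenes]
a, b = fit_count_regressor(attrs, [len(p) for p in scenes])
```

Because each head contributes unit mass, the total mass of each toy map is close to its head count, so the fitted line is close to the identity. In the paper's setting both stages are deep networks, and the second stage would draw on richer global attributes than a single sum.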

Data availability

The data that support the findings of this study can be obtained from the benchmark datasets Venice (Liu et al. 2019), Mall (Chen et al. 2012), and UCSD (Chan et al. 2008).

Acknowledgements

The support and the resources provided by the ‘PARAM Shivay Facility’ under the National Supercomputing Mission, Government of India at the Indian Institute of Technology, Varanasi, are gratefully acknowledged.

Funding

No funding was obtained for this study.

Author information

Corresponding author

Correspondence to Santosh Kumar Tripathy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest that could have influenced the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tripathy, S.K., Srivastava, S., Bajaj, D. et al. A Novel cascaded deep architecture with weak-supervision for video crowd counting and density estimation. Soft Comput 28, 8319–8335 (2024). https://doi.org/10.1007/s00500-024-09681-4

  • DOI: https://doi.org/10.1007/s00500-024-09681-4
