MACC Net: Multi-task attention crowd counting network

Applied Intelligence

Abstract

Crowd counting and crowd density map estimation face several challenges, including occlusion, non-uniform density, and intra-scene variations in scale and perspective. Crowd counting approaches have progressed significantly in recent years, especially with the emergence of deep learning and large-scale crowd datasets. This work addresses crowd density estimation in both sparse and crowded scenes. We propose a multi-task attention-based crowd counting network (MACC Net) with three components: 1) density level classification, which provides global contextual information to the density estimation network; 2) density map estimation; and 3) segmentation-guided attention, which filters background noise out of the foreground features. MACC Net is evaluated on four popular datasets: ShanghaiTech, UCF-CC-50, UCF-QNRF, and the recently released HaCrowd. It achieves state-of-the-art estimation results on HaCrowd and UCF-CC-50 and competitive results on the other datasets.
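
To make two of the ideas in the abstract concrete, the following minimal sketch shows how an attention mask can suppress background responses in a density map, and how a crowd count is read off a density map by summing it. It is not the authors' implementation: the function names, the 2x2 grids, the numeric values, and the use of a plain element-wise product as the attention step are all assumptions chosen for illustration.

```python
# Illustrative sketch only. Names, shapes, and values are assumptions,
# not details taken from the MACC Net paper.

def apply_attention(density, attention):
    """Multiply each density cell by its attention weight.

    A weight near 0 suppresses a background cell; a weight near 1
    keeps a foreground (crowd) cell unchanged.
    """
    return [
        [d * a for d, a in zip(d_row, a_row)]
        for d_row, a_row in zip(density, attention)
    ]


def crowd_count(density):
    """Estimate the count as the sum of all cells of a density map."""
    return sum(sum(row) for row in density)


# Hypothetical 2x2 density map with one spurious background response.
density = [[0.5, 0.25],
           [0.25, 1.0]]

# Hypothetical segmentation-style mask: the top-right cell is background.
attention = [[1.0, 0.0],
             [1.0, 1.0]]

masked = apply_attention(density, attention)
raw_count = crowd_count(density)        # count before filtering
filtered_count = crowd_count(masked)    # count after filtering
```

In this toy example the background cell contributes 0.25 to the unfiltered sum, and masking removes exactly that contribution from the estimated count.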

Acknowledgments

This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, under grant no. FP-090-43.

Author information

Corresponding author

Correspondence to Sahar Aldhaheri.

About this article

Cite this article

Aldhaheri, S., Alotaibi, R., Alzahrani, B. et al. MACC Net: Multi-task attention crowd counting network. Appl Intell 53, 9285–9297 (2023). https://doi.org/10.1007/s10489-022-03954-x

  • DOI: https://doi.org/10.1007/s10489-022-03954-x
