A hybrid model of convolutional neural networks and deep regression forests for crowd counting

Ji, Qingge; Zhu, Ting; Bao, Di

doi:10.1007/s10489-020-01688-2

Published: 25 March 2020

Volume 50, pages 2818–2832 (2020)

Applied Intelligence

Qingge Ji^1,2,
Ting Zhu^1,2 &
Di Bao^1,2

Abstract

Real-time monitoring of crowd variation through video surveillance plays a significant role in the new generation of smart-city technology. We propose a crowd counting algorithm based on deep regression forests, named CountForest. First, exploiting the correlation among frames, we transform the crowd counting problem into a label distribution learning problem. We then combine a convolutional neural network (CNN) with a deep regression forest to form a hybrid model: the CNN performs feature learning, and the deep decision forest is extended to address label distribution learning for crowd counting. Specifically, the proposed network replaces its softmax layer with this probabilistic decision forest to better establish the mapping between image features and the number of people in the crowd, yielding an end-to-end hybrid model for crowd counting. Experiments on selected public datasets show that our method not only attains high counting accuracy but also offers comparable robustness and real-time performance.
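The two ideas named in the abstract, recasting a count as a label distribution and replacing a softmax layer with a probabilistic decision forest, can be illustrated with a minimal sketch. This is not the authors' implementation: the discrete Gaussian target, the value of `sigma`, the single full binary tree, the sigmoid split functions, and every function name below are assumptions chosen for illustration. In a hybrid model the split logits would be produced by the CNN; here they are plain numbers.

```python
import math


def count_to_label_distribution(count, max_count, sigma=2.0):
    """Spread a scalar count into a normalized discrete Gaussian over 0..max_count.

    The Gaussian shape and sigma are illustrative assumptions.
    """
    weights = [math.exp(-((k - count) ** 2) / (2.0 * sigma ** 2))
               for k in range(max_count + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_tree_predict(split_logits, leaf_distributions):
    """Forward pass of one soft decision tree with a full binary shape.

    split_logits: one real value per internal node, in breadth-first order
        (hypothetically, outputs of the CNN's last layer).
    leaf_distributions: one distribution over counts per leaf, left to right.
    Returns the mixture of leaf distributions weighted by routing probability.
    """
    n_leaves = len(leaf_distributions)
    if n_leaves < 2 or n_leaves & (n_leaves - 1) != 0:
        raise ValueError("number of leaves must be a power of two, at least 2")
    if len(split_logits) != n_leaves - 1:
        raise ValueError("need exactly one split logit per internal node")
    depth = n_leaves.bit_length() - 1
    n_labels = len(leaf_distributions[0])
    mixture = [0.0] * n_labels
    for leaf in range(n_leaves):
        node, prob = 0, 1.0
        for level in range(depth):
            # The bits of the leaf index encode the left/right path from the root.
            go_right = (leaf >> (depth - 1 - level)) & 1
            p_left = sigmoid(split_logits[node])
            prob *= (1.0 - p_left) if go_right else p_left
            node = 2 * node + 1 + go_right
        for k in range(n_labels):
            mixture[k] += prob * leaf_distributions[leaf][k]
    return mixture


def expected_count(distribution):
    """Collapse a distribution over counts into a single count estimate."""
    return sum(k * p for k, p in enumerate(distribution))


if __name__ == "__main__":
    # Four hypothetical leaves, each holding a distribution centred on one count.
    leaves = [count_to_label_distribution(c, 10, sigma=1.0) for c in (2, 4, 6, 8)]
    prediction = soft_tree_predict([0.0, 0.0, 0.0], leaves)
    print(expected_count(prediction))
```

Because every routing probability is a smooth function of the split logits, the predicted distribution is differentiable with respect to them, which is what allows a model of this kind to be trained end to end. A forest would average the outputs of several such trees.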

Acknowledgements

This work was supported by the Natural Science Foundation of Guangdong Province, China (No. 2016A030313288).

Author information

Authors and Affiliations

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510006, China
Qingge Ji, Ting Zhu & Di Bao
Guangdong Province Key Laboratory of Big Data Analysis and Processing, Guangzhou, 510006, China
Qingge Ji, Ting Zhu & Di Bao

Corresponding author

Correspondence to Qingge Ji.

About this article

Cite this article

Ji, Q., Zhu, T. & Bao, D. A hybrid model of convolutional neural networks and deep regression forests for crowd counting. Appl Intell 50, 2818–2832 (2020). https://doi.org/10.1007/s10489-020-01688-2

Issue Date: September 2020
