Abstract
Crowd distribution estimation is in strong demand for surveillance applications such as overcrowding detection, anomaly detection and traffic monitoring. Although a number of methods have been proposed for crowd counting, it remains challenging to estimate an accurate crowd distribution map that reflects the actual spatial intensity of the crowd in a real scene, because crowd distribution is inhomogeneous and the observation perspective is uncertain. To address this problem, this paper proposes a framework based on a multi-scale recursive convolutional neural network (MRCNN) that maps an image to its crowd distribution map. The proposed network is trained alternately on two joint objectives: estimation of the crowd density map and estimation of the perspective map. Since the scale and scale variance of the crowd are good cues for estimating both the crowd density map and the perspective map, formulating these two objectives together enables the network to learn a strong feature representation for both tasks. By convolving a perspective-adaptive kernel with the crowd density map, we generate a pixel-wise crowd distribution map in which each pixel value denotes the actual intensity of the crowd at the corresponding location in the real scene. An extension of the ShanghaiTech crowd dataset B is introduced for the perspective map learning task, in which 700 images with about 3500 height-annotated pedestrians are labelled. Experimental results on the ShanghaiTech datasets (both A and B), the UCF_CC_50 dataset and the UCSD dataset demonstrate the effectiveness and reliability of the proposed approach.
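The abstract does not specify the form of the perspective-adaptive kernel, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Gaussian kernel whose standard deviation is proportional to the local perspective value (taken here to mean pixels per metre), and it spreads each pixel's density with that kernel while preserving the total count. The function name `adaptive_smooth` and the parameters `scale` and `min_sigma` are hypothetical.

```python
import numpy as np

def adaptive_smooth(density, perspective, scale=0.5, min_sigma=0.5):
    """Spread each pixel's density with a Gaussian whose width follows
    the perspective map (assumed: larger value = closer to the camera)."""
    h, w = density.shape
    out = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.nonzero(density)
    for y, x in zip(ys, xs):
        # Assumption: kernel width grows linearly with the perspective value.
        sigma = max(min_sigma, scale * float(perspective[y, x]))
        r = int(np.ceil(3 * sigma))
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        yy, xx = np.mgrid[y0:y1, x0:x1]
        k = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))
        k /= k.sum()  # renormalise the (possibly clipped) kernel to keep the count
        out[y0:y1, x0:x1] += density[y, x] * k
    return out

# Toy example: one person far from the camera (top) and one near it (bottom).
density = np.zeros((20, 20))
density[2, 10] = 1.0
density[17, 10] = 1.0
perspective = np.linspace(1.0, 6.0, 20)[:, None] * np.ones((1, 20))
distribution = adaptive_smooth(density, perspective)
```

In this toy example the person near the camera is spread over a wider image region than the distant one, while the sum of the map still equals the number of people.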
Acknowledgments
This work is supported by the Natural Science Foundation of China (61472380).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Wei, M., Kang, Y., Song, W., Cao, Y. (2018). Crowd Distribution Estimation with Multi-scale Recursive Convolutional Neural Network. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_12
DOI: https://doi.org/10.1007/978-3-319-73603-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73602-0
Online ISBN: 978-3-319-73603-7
eBook Packages: Computer Science, Computer Science (R0)