Convolutional neural networks for crowd behaviour analysis: a survey

  • Survey
  • The Visual Computer

Abstract

Interest in automatic crowd behaviour analysis has grown considerably in the last few years. Crowd behaviour analysis has become integral worldwide to organizing events peacefully and minimizing casualties in places of public and religious interest. Traditionally, crowd analysis relied on handcrafted features. However, real-world images and videos contain nonlinearities that must be modelled efficiently to obtain accurate results. As in many other computer vision areas, deep learning-based methods have made great strides towards state-of-the-art performance in crowd behaviour analysis. This paper presents a comprehensive survey of current convolutional neural network (CNN)-based methods for crowd behaviour analysis. We also survey popular software tools for CNNs from recent years. The survey details the attributes of CNNs, with special emphasis on the optimization methods used in CNN-based approaches. It also reviews fundamental and innovative methodologies, both conventional and recent, reported in the last few years. We introduce a taxonomy that summarizes important aspects of CNNs for crowd behaviour analysis. Details of the proposed architectures, crowd analysis needs and their respective datasets are reviewed. In addition, we summarize and discuss the main works proposed so far, with particular interest in how CNNs treat the temporal dimension of data, their distinguishing features, and the opportunities and challenges for future research. To the best of our knowledge, this is a unique survey of crowd behaviour analysis using CNNs. We hope that it will become a reference in this ever-evolving field of research.

References

  1. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  2. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)

    Article  Google Scholar 

  3. Deng, L.: An overview of deep-structured learning for information processing. In: Asian-Pacific Signal and Information Processing Annual Summit and Conference (APSIPA-ASC), Oct. 2011

  4. Vicsek, T., Zafeiris, A.: Collective motion. Phys. Rep. 517(3), 71–140 (2012)

    Article  Google Scholar 

  5. Hinton, G.: Deep neural networks for acoustic modelling in speech recognition. IEEE Signal Process. Mag. 29(6), 82–97 (2012)

    Article  Google Scholar 

  6. Yu, D., Deng, L.: Deep learning and its applications to signal and information processing. IEEE Signal Process. Mag. 28(1), 145–154 (2011)

    Article  Google Scholar 

  7. Arel, I., Rose, C., Karnowski, T.: Deep machine learning—a new frontier in artificial intelligence. IEEE Comput. Intell. Mag. 5(4), 13–18 (2010)

    Article  Google Scholar 

  8. Deng, L.: A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Trans. Signal Inf. Process. 3, e2 (2014)

    Article  Google Scholar 

  9. Fukushima, K.: Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36(4), 193–202 (1980)

    Article  MATH  Google Scholar 

  10. Lo, S.-C., Lou, S.-L., Lin, J.-S., Freedman, M.T., Chien, M.V., Mun, S.K.: Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans. Med. Imaging 14(4), 711–718 (1995)

    Article  Google Scholar 

  11. Lecun, Y.B.L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE (1998)

  12. Krizhevsky, A., Sutskever, I., Geoffrey, E.H.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (NIPS 2012), vol. 25 (2012)

  13. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 1–42 (2014)

    MathSciNet  Google Scholar 

  14. Moeslund, T.B., Granum, E.: A survey of computer vision-based human motion capture. Comput. Vis. Image Underst. 81(3), 231–268 (2001)

    Article  MATH  Google Scholar 

  15. Bishop, C.M.: Pattern Recognition & Machine Learning, vol. 128, 1st edn, pp. 1–58. Springer, New York (2006)

    MATH  Google Scholar 

  16. Kephart, J.O., Chess, D.M.: The vision of autonomic computing. Computer 36(1), 41–50 (2003)

    Article  MathSciNet  Google Scholar 

  17. Lemley, J., Bazrafkan, S., Corcoran, P.: Deep learning for consumer devices and services: pushing the limits for machine learning, artificial intelligence, and computer vision. IEEE Consum. Electron. Mag. 6(2), 48–56 (2017)

    Article  Google Scholar 

  18. Leo, M., Medioni, G., Trivedi, M., Kanade, T., Farinella, G.: Computer vision for assistive technologies. Comput. Vis. Image Underst. 15, 1–15 (2017)

    Article  Google Scholar 

  19. Liu, D., Wang, Z., Nasrabadi, N., Huang, T.: Learning a mixture of deep networks for single image super-resolution. In: Asian Conference on Computer Vision (2017)

  20. Wing, J.M.: Computational thinking. Commun. ACM 49(3), 33–35 (2006)

    Article  Google Scholar 

  21. Sun, Y., Fisher, R.: Object-based visual attention for computer vision. Artif. Intell. 146(1), 77–123 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  22. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)

    Article  Google Scholar 

  23. Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G.: Recent advances in convolutional neural networks. eprint arXiv:1512.07108, Dec. 2015

  24. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)

    Article  Google Scholar 

  25. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  26. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: International Conference on Neural Information Processing Systems (2007)

  27. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.-A.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)

    MathSciNet  MATH  Google Scholar 

  28. Hubel, D.H., Wiesel, T.N.: Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195(1), 215–243 (1968)

    Article  Google Scholar 

  29. LeCun, Y., Cortes, C., Burges, C.J.: MNIST handwritten digit database (2010)

  30. Gewin, V.: Turning point: intelligence programmer. Nature 533(281), 145–284 (2016)

    Google Scholar 

  31. Clark, C., Storkey, A.: Teaching deep convolutional neural networks to play go. arXiv preprint arXiv:1412.3409 (2014)

  32. Wallach, I., Dzamba, M., Heifets, A.: AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv preprint arXiv:1510.02855 (2015)

  33. Weisstein, E.W.: Convolution. From MathWorld—a Wolfram web resource (2009)

  34. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: ECCV (2014)

  35. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)

  36. Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston (2015)

  37. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. eprint arXiv:1512.03385 (2015)

  38. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European Conference on Computer Vision, Amsterdam (2016)

  39. Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)

  40. Singh, S., Hoiem, D., Forsyth, D.: Swapout: learning an ensemble of deep architectures. arXiv preprint arXiv:1605.06465 (2016)

  41. Targ, S., Almeida, D., Lyman, K.: Resnet in resnet: generalizing residual architectures. arXiv preprint arXiv:1603.08029 (2016)

  42. Zhang, K., Sun, M., Han, T.X., Yuan, X., Guo, L., Liu, T.: Residual networks of residual networks: multilevel residual networks. IEEE Trans. Circuits Syst. Video Technol. (2016). https://doi.org/10.1109/TCSVT.2017.2654543

    Article  Google Scholar 

  43. Ngiam, J., Chen, Z., Chia, D., Koh, P.W., Le, Q.V., Ng, A.Y.: Tiled convolutional neural networks. In: NIPS (2010)

  44. Wang, Z., Oates, T.: Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In: AAAI Workshop (2015)

  45. Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In ICLR (2016)

  46. Kalchbrenner, N., Espeholt, L., Simonyan, K., Oord, A., Graves, A., Kavukcuoglu, K.: Neural machine translation in linear time. arXiv preprint arXiv:1610.10099 (2016)

  47. Sercu, T., Goel, V.: Dense prediction on sequences with time-dilated convolutions for speech recognition. In: NIPS Workshop (2016)

  48. Oord, V.D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., Kavukcuoglu, K.: Wavenet: a generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016)

  49. Lin, M., Chen, Q., Yan, S.: Network in network. arXiv:1312.4400 (2013)

  50. Szegedy, C., Ioe, S., Vanhoucke, V., Alemi, A.: Inceptionv4, Inception-ResNet and the impact of residual connections on learning. arXiv:1602.07261 (2016)

  51. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. arXiv:1411.4038 (2015)

  52. Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: CVPR (2010)

  53. Zeiler, M.D., Taylor, G.W., Fergus, R.: Adaptive deconvolutional networks for mid and high level feature learning. In: ICCV (2011)

  54. Bruna, J., Szlam, A., LeCun, Y.: Signal recovery from pooling representations. eprint arXiv:1311.4025 (2014)

  55. Gulcehre, C., Cho, K., Pascanu, R., Bengio, Y.: Learned-norm pooling for deep feedforward and recurrent neural networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2014)

  56. Simoncelli, E.P., Heeger, D.J.: A model of neuronal responses in visual area MT. Vis. Res. 38(5), 743–761 (1998)

    Article  Google Scholar 

  57. Hyvärinen, A., Köster, U.: Complex cell pooling and the statistics of natural images. Netw. Comput. Neural Syst. 18(2), 81–100 (2007)

    Article  MathSciNet  Google Scholar 

  58. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation SOF feature detectors. eprint arXiv:1207.0580 (2012)

  59. Wan, L., Zeiler, M., Zhang, S., Cun, Y.L., Fergus, R.: Regularization of neural networks using dropconnect. In: PMLR (2013)

  60. Zeiler, M.D., Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks. eprint arXiv:1301.3557 (2013)

  61. Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. In: NIPS, Montreal (2015)

  62. Gong, Y., Ke, Q., Isard, M., Lazebnik, S.: A multi-view embedding space for modeling internet images, tags, and their semantics. Int. J. Comput. Vis. 106(2), 210–233 (2014)

    Article  Google Scholar 

  63. Jégou, H., Perronnin, F., Douze, M., Sanchez, J., Perez, P., Schmid, C.: Aggregating local image descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1704–1716 (2012)

    Article  Google Scholar 

  64. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: International Conference on International Conference on Machine Learning, Haifa (2010)

  65. Maas, A.L., Hannun, Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: ICML Workshop on Deep Learning for Audio, Speech and Language Processing (2013)

  66. Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A., Bengio, Y.: Maxout networks. In: International Conference on Machine Learning, Atlanta (2013)

  67. Springenberg, J.T., Riedmiller, M.: Improving deep neural networks with probabilistic maxout units. arXiv preprint arXiv:1312.6116 (2013)

  68. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, In: IEEE International Conference on Computer Vision (2015)

  69. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853 (2015)

  70. Clevert, D.-A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289 (2015)

  71. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: International Conference on Computational Statistics (COMPSTAT’2010) (2010)

  72. Wijnhoven, R.G., dde With, P.H.N: Fast training of object detection using stochastic gradient descent. In: 2010 20th International Conference on Pattern Recognition (ICPR). IEEE (2010)

  73. Zinkevich, M.A., Weimer, M., Smola, A., Li, L.: Parallelized stochastic gradient descent. In: NIPS, Vancouver (2010)

  74. Recht, B., Re, C., Wright, S., Niu, F.: Hogwild: a lock-free approach to parallelizing stochastic gradient descent. In: NIPS (2011)

  75. Bengio, Y.: Deep learning of representations: looking forward. In: International Conference on Statistical Language and Speech Processing (2013)

  76. Dean, G., Corrado, G.S., Monga, R., Chen, K., Devin, M., Le, Q.V., Mao, M.Z., Ranzato, M., Senior, A., Tucker, P., Yang, K., Ng, A.Y.: Large scale distributed deep networks. In: NIPS. Lake Tahoe, Nevada (2012)

  77. Zhuang, Y., Chin, W.-S., Juan, Y.-C., Lin, C.-J.: A fast parallel SGD for matrix factorization in shared memory systems. In: ACM Conference on Recommender Systems, Hong Kong (2013)

  78. Thoma, M.: Analysis and optimization of convolutional neural network architectures. arXiv preprint arXiv:1707.09725 (2017)

  79. Ooi, B.C., et al.: SINGA: a distributed deep learning platform. In: ACM International Conference on Multimedia, Brisbane, (2015)

  80. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. In: ACM International Conference on Multimedia, Orlando (2014)

  81. http://deeplearning4j.org/. Last visited 27 May 2017

  82. King, D.E.: Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)

    Google Scholar 

  83. Seide, F.: Keynote: the computer science behind the Microsoft Cognitive Toolkit: An open source large-scale deep learning toolkit for Windows and Linux. In: IEEE/ACM International Symposium on Code Generation and Optimization (CGO) (2017)

  84. Chen, T., et al.: Mxnet: a flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274 (2015)

  85. Lopez, R.: Open NN: an open source neural networks C++ library [software] (2014)

  86. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Ghemawat, S.: TensorFlow: large scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016)

  87. Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow, I., Bergeron, A., Bouchard, N., Warde-Farley, D., Bengio, Y.: Theano: new features and speed improvements. In: Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop (2012)

  88. Collobert, K.K.C.F.R.: Torch7: a matlab-like environment for machine learning. In: BigLearn, NIPS Workshop (No. EPFL-CONF-192376) (2011)

  89. Wu, S., Moore, B.E., Shah, M.: Chaotic invariants of Lagrangian particle trajectories for anomaly detection in crowded scenes. In: IEEE Conference on Computer Vision and Pattern Recognition, San Francisco (2010)

  90. Zitouni, M.S., Bhaskar, H., Dias, J., Al-Mualla, M.: Advances and trends in visual crowd analysis: a systematic survey and evaluation of crowd modelling techniques. Neurocomputing 186, 139–159 (2016)

    Article  Google Scholar 

  91. Rodriguez, M., Laptev, I., Sivic, J., Audibert, J.Y.: Density-aware person detection and tracking in crowds. In: IEEE International Conference on Computer Vision (2011)

  92. Xu, D., Song, R., Wu, X., Li, N., Feng, W., Qian, H.: Video anomaly detection based on a hierarchical activity discovery within spatio-temporal context. Neurocomputing 143, 144–152 (2014)

    Article  Google Scholar 

  93. Cheng, Z., Qin, L., Huang, Q., Yan, S., Tian, Q.: Recognizing human group action by layered model with multiple cues. Neurocomputing 136, 124–135 (2014)

    Article  Google Scholar 

  94. Liang, R., Zhu, Y., Wang, H.: Counting crowd flow based on feature points. Neurocomputing 133, 377–384 (2014)

    Article  Google Scholar 

  95. Zhan, B., Monekosso, D.N., Remagnino, P., Velastin, S.A., Xu, L.-Q.: Crowd analysis: a survey. Mach. Vis. Appl. Mach. Vis. Appl. 19(5–6), 345–357 (2008)

    Article  Google Scholar 

  96. Rodrigues, F., Lourenco, M., Ribeiro, B., Pereira, F.: Learning supervised topic models for classification and regression from crowds. IEEE Trans. Pattern Anal. Mach. Intell. 99, 1–1 (2017)

    Google Scholar 

  97. Ali, S., Shah, M.: A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis. In: IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, pp. 1–6 (2007)

  98. McIvor, A.M.: Background subtraction techniques, image and vision computing. Proc. Image Vis. Comput. 4, 3099–3104 (2000)

    Google Scholar 

  99. Black, M.J., Fleet, D.J.: Probabilistic detection and tracking of motion bound-aries. Int. J. Comput. Vis. 38(3), 231–245 (2000)

    Article  MATH  Google Scholar 

  100. Garcia-Bunster, G., Torres-Torriti, M., Oberli, C.: Crowded pedestrian counting at bus stops from perspective transformations of foreground areas. IET Comput. Vis. 6(4), 296–305 (2012)

    Article  MathSciNet  Google Scholar 

  101. Chen, D.Y., Huang, P.C.: Visual-based human crowds behavior analysis based on graph modeling and matching. IEEE Sens. J. 13(6), 2129–2138 (2013)

    Article  Google Scholar 

  102. Stauffer, C., Grimson, W.E.L.W.: Adaptive background mixture models for real-time tracking. In: IEEE Conference Computer Vision and Pattern Recognition (1999)

  103. Chan, A.B., Vasconcelos, N.: Modeling, clustering, and segmenting video with mixtures of dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 909–926 (2008)

    Article  Google Scholar 

  104. Junior, J.C.S.J., Musse, S.R., Jung, C.R.: Crowd analysis using computer vision techniques. IEEE Signal Process. Mag. 27(5), 66–77 (2010)

    Google Scholar 

  105. http://www.desibrandstrategy.com/why-tirupati-tirumala-needs-smarter-analytics/. Accessed 17 June 2017

  106. http://l7.alamy.com/zooms/dab050fbd7424ff597ca74599e8eb7f9/holi-celebration-in-dauji-temple-dauji-uttar-pradesh-india-asia-d2r9h0.jpg. Accessed 17 June 2017

  107. https://cdn.theatlantic.com/assets/media/img/photo/2011/03/holi-the-festival-of-colors-2011/h15_19113087/main_900.jpg?1420521857. Accessed 17 June 2017

  108. https://en.wikipedia.org/wiki/List_of_human_stampedes_in_Hindu_temples. Accessed 30 Dec 2017

  109. http://edition.cnn.com/2017/05/22/europe/manchester-arena-incident/. Accessed 23 May 2017

  110. http://www.dailynewsegypt.com/2015/02/09/28-football-fans-killed-deliberate-massacre-ultras/. Accessed 9 Feb 2015

  111. http://robertchaen.com/2015/01/01/7935/. Accessed 1 Jan 2015

  112. https://en.wikipedia.org/wiki/List_of_terrorist_incidents_in_India. Accessed 11 Feb 2018

  113. Dimokranitou, A., Tsechpenakis, G.: Adversarial autoencoders for anomalous event detection in images. Thesis, Purdue University (2017)

  114. Saxena, S.: Crowd behavior recognition for video surveillance. In: International Conference on Advanced Concepts for Intelligent Vision Systems (2008)

  115. Husni, M., Suryana, N.: Crowd event detection in computer vision. In: International Conference on Signal Processing Systems (ICSPS) (2010)

  116. Mehran, R., Oyama, A., Shah, M.: Abnormal crowd behavior detection using social force model. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  117. Rodriguez, M., Ali, S., Kanade, T.: Tracking in unstructured crowded scenes. In: International Conference on Computer Vision (2009)

  118. Ozturk, O., Yamasaki, T., Aizawa, K.: Detecting dominant motion flows in unstructured/structured crowd scenes. In: International Conference on Pattern Recognition (ICPR), Istanbul (2010)

  119. Sjarif, N.N.A., Shamsuddin, S.M., Hashim, S.Z.: Detection of abnormal behaviors in crowd scene: a review. Int. J. Adv. Soft Comput. Appl. 4(1), 1–33 (2012)

    Google Scholar 

  120. Yu, H., Zhou, Y., Simmons, J., Przybyla, C.P., Lin, Y., Fan, X., Mi, Y., Wang, S.: Groupwise tracking of crowded similar-appearance targets from low-continuity image sequences. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  121. Ihaddadene, N., Djeraba, C.: Real-time crowd motion analysis. In: International Conference on Pattern Recognition, Tampa (2008)

  122. Johansson, A., Helbing, D., Al-Abideen, H.Z., Al-Bosta, S.: From crowd dynamics to crowd safety: a video-based analysis. Adv. Complex Syst. 11(4), 497–527 (2008)

    Article  MATH  Google Scholar 

  123. Cao, T., Wu, X., Guo, J., Yu, S., Xu, Y.: Abnormal crowd motion analysis. In: IEEE International Conference on Robotics and Biomimetics (ROBIO), Guilin (2009)

  124. Hassner, T., Itcher, Y., Kliper-Gross, O.: Violent flows: Real-time detection of violent crowd behavior. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, Providence (2012)

  125. Krausz, B., Bauckhage, C.: Automatic detection of dangerous motion behavior in human crowds. In: IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Klagenfurt (2011)

  126. Liao, H., Xiang, J., Sun, W., Feng, Q., Dai, J.: An abnormal event recognition in crowd scene. In: International Conference on Image and Graphics (ICIG), Hefei (2011)

  127. Wang, B., Ye, M., Li, X., Zhao, F., Ding, J.: Abnormal crowd behavior detection using high-frequency and spatio-temporal features. Mach. Vis. Appl. 23(3), 501–511 (2012)

    Article  Google Scholar 

  128. Andersson, M., Gustafsson, F., St-Laurent, L., Prevost, D.: Recognition of anomalous motion patterns in urban surveillance. IEEE J. Sel. Top. Signal Process. 7(1), 102–110 (2013)

    Article  Google Scholar 

  129. Cho, S.-H., Kang, H.-B.: Abnormal behavior detection using hybrid agents in crowded scenes. Pattern Recognit. Lett. 44, 64–70 (2014)

    Article  Google Scholar 

  130. Candamo, J., Shreve, M., Goldgof, D.B., Sapper, D.B., Kasturi, R.: Understanding transit scenes: a survey on human behavior-recognition algorithms. IEEE Trans. Intell. Transp. Syst. 11(1), 206–224 (2010)

    Article  Google Scholar 

  131. Ge, W., Collins, R.T., Ruback, R.B.: Vision-based analysis of small groups in pedestrian crowds. IEEE Trans. Pattern Anal. Mach. Intell. 34(5), 1003–1016 (2012)

    Article  Google Scholar 

  132. Solmaz, B., Moore, B.E., Shah, M.: Identifying behaviors in crowd scenes using stability analysis for dynamical systems. IEEE Trans. Pattern Anal. Mach. Intell. 34(10), 2064–2070 (2012)

    Article  Google Scholar 

  133. Krausz, B., Bauckhage, C.: Loveparade 2010: automatic video analysis of a crowd disaster. Comput. Vis. Image Underst. 116(3), 307–319 (2012)

    Article  Google Scholar 

  134. Ge, W., Collins, R.T., Ruback, B.: Automatically detecting the small group structure of a crowd. In: Workshop on Applications of Computer Vision (WACV), Snowbird (2009)

  135. Dee, H.M., Caplier, A.: Crowd behaviour analysis using histograms of motion direction. In: International Conference on Image Processing (ICIP), Hong Kong (2010)

  136. Subburaman, V.B., Descamps, A., Carincotte, C.: Counting people in the crowd using a generic head detector. In: International Conference on Advanced Video and Signal-Based Surveillance (AVSS), Beijing (2012)

  137. Loy, C.C., Chen, K., Gong, S., Xiang, T.: Crowd counting and profiling: methodology and evaluation. In: Ali, S., Nishino, K., Manocha, D., Shah, M. (eds.) Modeling, Simulation and Visual Analysis of Crowds: A Multidisciplinary Perspective, pp. 347–382. Springer, New York (2013)

    Chapter  Google Scholar 

  138. Ullah, H., Conci, N.: Crowd motion segmentation and anomaly detection via multi-label optimization. In: ICPR Workshop on Pattern Recognition and Crowd Analysis (2012)

  139. Krisp, J.M., Peters, S., Burkert, F.: Visualizing crowd movement patterns using a directed kernel density estimation. In: Earth Observation of Global Changes (EOGC), pp. 255–268. Springer, Berlin (2013)

  140. Ullah, H., Conci, N.: Structured learning for crowd motion segmentation. In: International Conference on Image Processing (ICIP), Melbourne (2013)

  141. Ullah, H., Ullah, M., Conci, N.: Dominant motion analysis in regular and irregular crowd scenes. In: International Workshop on Human Behavior Understanding (2014)

  142. Li, W., Wu, X., Matsumoto, K., Zhao, H.-A.: Crowd density estimation: an improved approach. In: International Conference on Signal Processing (ICSP), Beijing (2010)

  143. Hsu, W.-L., Lin, K.-F., Tsai, C.-L.: Crowd density estimation based on frequency analysis. In: International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), Dalian (2011)

  144. Zhang, Z., Li, M.: Crowd density estimation based on statistical analysis of local intra-crowd motions for public area surveillance. Opt. Eng. 51(4), 047204 (2012)

    Article  Google Scholar 

  145. Zhou, B., Zhang, F., Peng, L.: Higher-order SVD analysis for crowd density estimation. Comput. Vis. Image Underst. 116(9), 1014–1021 (2012)

    Article  Google Scholar 

  146. Idrees, H., Soomro, K., Shah, M.: Detecting humans in dense crowds using locally-consistent scale prior and global occlusion reasoning. IEEE Trans. Pattern Anal. Mach. Intell. 37(10), 1986–1998 (2015)

    Article  Google Scholar 

  147. Rao, A.S., Gubbi, J., Marusic, S., Palaniswami, M.: Estimation of crowd density by clustering motion cues. Vis. Comput. 31(11), 1533–1552 (2015)

    Article  Google Scholar 

  148. Wang, L., Hu, W., Tan, T.: Recent developments in human motion analysis. Pattern Recognit. 36(3), 585–601 (2003)

    Article  Google Scholar 

  149. Hu, W., Tan, T., Wang, L., Maybank, S.: A survey on visual surveillance of object motion and behaviors. IEEE Trans. Syst. Man Cybern. 34(3), 334–352 (2004)

    Article  Google Scholar 

  150. Sodemann, A.A., Ross, M.P., Borghetti, B.J.: A review of anomaly detection in automated surveillance. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 42(6), 1257–1272 (2012)

    Article  Google Scholar 

  151. Gowsikhaa, D., Abirami, S., Baskaran, R.: Automated human behavior analysis from surveillance videos: a survey. Artif. Intell. Rev. 42(4), 747–765 (2014)

    Article  Google Scholar 

  152. Thida, M., Yong, Y., Climent-Pérez, P., Eng, H., Remagnino, P.: A literature review on video analytics of crowded scenes. In: Atrey, P., Kankanhalli, M., Cavallaro, A. (eds.) Intelligent Multimedia Surveillance, pp. 17–36. Springer, Berlin (2013)

    Chapter  Google Scholar 

  153. Jo, H., Chug, K., Sethi, R.: A review of physics-based methods for group and crowd analysis in computer vision. J. Postdr. Res. 1(1), 4–7 (2013)

    Google Scholar 

  154. Vishwakarma, S., Agrawal, A.: A survey on activity recognition and behavior understanding in video surveillance. Vis. Comput. 29(10), 983–1009 (2013)

    Article  Google Scholar 

  155. Afsar, P., Cortez, P., Santos, H.: Automatic visual detection of human behavior: a review from 2000 to 2014. Expert Syst. Appl. 42(20), 6935–6956 (2015)

    Article  Google Scholar 

  156. Li, T., Chang, H., Wang, M., Ni, B., Hong, R., Yan, S.: Crowded scene analysis: a survey. IEEE Trans. Circuits Syst. Video Technol. 25(3), 367–386 (2015)

    Article  Google Scholar 

  157. Kok, V.J., Lim, M.K., Chan, C.S.: Crowd behavior analysis: a review where physics meets biology. Neurocomputing 177, 342–362 (2016)

    Article  Google Scholar 

  158. Grant, J.M., Flynn, P.J.: Crowd scene understanding from video: a survey. ACM Trans. Multimed. Comput. Commun. Appl. 13(2), 1–23 (2017)

    Article  Google Scholar 

  159. Hughes, R.L.: The flow of human crowds. Annu. Rev. Fluid Mech. 35(1), 169–182 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  160. Leggett, R.: Real-time crowd simulation: a review. http://www.leggettnet.org.uk/docs/crowdsimulation.pdf (2004). Accessed 19 Jan 2015 (2004)

  161. Fisher, L.: The Perfect Swarm: The Science of Complexity in Everyday Life. Basic Books, New York (2009)

    Google Scholar 

  162. Moore, B.E., Ali, S., Mehran, R., Shah, M.: Visual crowd surveillance through a hydrodynamics lens. Commun. ACM 54(12), 64–73 (2011)

    Article  Google Scholar 

  163. Shao, J., Loy, C.C., Kang, K., Wang, X.: Crowded scene understanding by deeply learned volumetric slices. IEEE Trans. Circuits Syst. Video Technol. 27(3), 613–623 (2017)

    Article  Google Scholar 

  164. Andrearczyk, V., Whelan, P.F.: Convolutional neural network on three orthogonal planes for dynamic texture classification. arXiv preprint arXiv:1703.05530 (2017)

  165. Sabokrou, M., Fayyaz, M., Fathy, M., Klette, R.: Deep-cascade: cascading 3D deep neural networks for fast anomaly detection and localization in crowded scenes. IEEE Trans. Image Process. 26(4), 1992–2004 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  166. Kumagai, S., Hotta, K., Kurita, T.: Mixture of counting CNNs: adaptive integration of CNNs specialized to specific appearance for crowd counting. arXiv preprint arXiv:1703.09393 (2017)

  167. Zeng, L., Xu, X., Cai, B., Qiu, S., Zhang, T.: Multi-scale Convolutional Neural Networks for Crowd Counting. arXiv preprint arXiv:1702.02359 (2017)

  168. Zhuang, N., Yusufu, T., Ye, J., Hua, K.A.: Group activity recognition with differential recurrent convolutional neural networks. In: International Conference on Automatic Face & Gesture Recognition (FG 2017) (2017)

  169. Ahsan, U., Sun, C., Hays, J., Essa, I.: Complex event recognition from images with few training examples. In: IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa (2017)

  170. Zhang, C., Li, H., Wang, X., Yang, X.: Cross-scene crowd counting via deep convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston (2015)

  171. Kang, K., Wang, X.: Fully convolutional neural networks for crowd segmentation. arXiv preprint arXiv:1411.4464 (2014)

  172. Yun, S., Yun, K., Choi, J., Choi, J.Y.: Density-aware pedestrian proposal networks for robust people detection in crowded scenes. In: European Conference on Computer Vision (2016)

  173. Walach, E., Wolf, L.: Learning to count with CNN boosting. In: European Conference on Computer Vision (2016)

  174. Tieleman, T., Hinton, G.: Lecture 6.5–RmsProp: divide the gradient by a running average of its recent magnitude. COURSERA Neural Netw. Mach. Learn. 4, 26–31 (2012)

  175. Carvalho, J., Marques, M., Costeira, J.P.: Understanding people flow in transportation hubs. arXiv preprint arXiv:1705.00027 (2017)

  176. Boominathan, L., Kruthiventi, S.S., Babu, R.V.: CrowdNet: a deep convolutional network for dense crowd counting. In: ACM on Multimedia Conference, Amsterdam (2016)

  177. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)

  178. Onoro-Rubio, D., López-Sastre, R.J.: Towards perspective-free object counting with deep learning. In: European Conference on Computer Vision (2016)

  179. Kang, D., Dhar, D., Chan, A.B.: Crowd counting by adapting convolutional neural networks with side information. arXiv preprint arXiv:1611.06748 (2016)

  180. Marsden, M., McGuinness, K., Little, S., O’Connor, N.E.: Fully convolutional crowd counting on highly congested scenes. arXiv preprint arXiv:1612.00220 (2016)

  181. Zhao, Z., Li, H., Zhao, R., Wang, X.: Crossing-line crowd counting with two-phase deep neural networks. In: European Conference on Computer Vision (2016)

  182. Sourtzinos, P., Velastin, S.A., Jara, M., Zegers, P., Makris, D.: People counting in videos by fusing temporal cues from spatial context-aware convolutional neural networks. In: European Conference on Computer Vision (2016)

  183. Chattopadhyay, P., Vedantam, R., Selvaraju, R.R., Batra, D., Parikh, D.: Counting everyday objects in everyday scenes. arXiv preprint arXiv:1604.03505 (2016)

  184. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  185. Sheng, B., Shen, C., Lin, G., Li, J., Yang, W., Sun, C.: Crowd counting via weighted VLAD on dense attribute feature maps. IEEE Trans. Circuits Syst. Video Technol. 99, 1–1 (2016)

  186. Yi, S.: Pedestrian Behavior Modeling and Understanding in Crowds. Thesis, The Chinese University of Hong Kong, Hong Kong (2016)

  187. Cao, L., Zhang, X., Ren, W., Huang, K.: Large scale crowd analysis based on convolutional neural network. Pattern Recognit. 48(10), 3016–3024 (2015)

  188. Hu, Y., Chang, H., Nian, F., Wang, Y., Li, T.: Dense crowd counting from still images with convolutional neural networks. J. Vis. Commun. Image Represent. 38, 530–539 (2016)

  189. Shao, J., Loy, C.C., Kang, K., Wang, X.: Slicing convolutional neural network for crowd video understanding. In: IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas (2016)

  190. Burney, A., Syed, T.Q.: Crowd video classification using convolutional neural networks. In: International Conference on Frontiers of Information Technology (FIT), Islamabad (2016)

  191. Ravanbakhsh, M., Nabi, M., Mousavi, H., Sangineto, E., Sebe, N.: Plug-and-play CNN for crowd motion analysis: an application in abnormal event detection. arXiv preprint arXiv:1610.00307 (2016)

  192. Zhang, Y., Zhou, D., Chen, S., Gao, S., Ma, Y.: Single-image crowd counting via multi-column convolutional neural network. In: IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas (2016)

  193. Wang, T., Li, G., Lei, J., Li, S., Xu, S.: Crowd counting based on MMCNN in still images. In: Scandinavian Conference on Image Analysis (2017)

  194. Sabokrou, M., Fayyaz, M., Fathy, M., Moayedd, Z., Klette, R.: Fully convolutional neural network for fast anomaly detection in crowded scenes. arXiv preprint arXiv:1609.00866 (2016)

  195. Fu, M., Xu, P., Li, X., Liu, Q., Ye, M., Zhu, C.: Fast crowd density estimation with convolutional neural networks. Eng. Appl. Artif. Intell. 43, 81–88 (2015)

  196. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition, Columbus (2014)

  197. Wang, C., Zhang, H., Yang, L., Liu, S., Cao, X.: Deep people counting in extremely dense crowds. In: Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia (2015)

  198. Sindagi, V.A., Patel, V.M.: CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In: 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (2017)

  199. Sam, D.B., Surya, S., Babu, R.V.: Switching convolutional neural network for crowd counting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)

  200. Sindagi, V.A., Patel, V.M.: Generating high-quality crowd density maps using contextual pyramid CNNs. In: IEEE International Conference on Computer Vision, Venice, Italy (2017)

  201. Xiong, F., Shi, X., Yeung, D.-Y.: Spatiotemporal modeling for crowd counting in videos. arXiv preprint arXiv:1707.07890 (2017)

  202. Liu, B., Vasconcelos, N.: Bayesian model adaptation for crowd counts. In: Proceedings of the IEEE International Conference on Computer Vision (2015)

  203. Sindagi, V.A., Patel, V.M.: A survey of recent advances in CNN-based single image crowd counting and density estimation. Pattern Recognit. Lett. (2017). https://doi.org/10.1016/j.patrec.2017.07.007

  204. Pham, V.-Q., Kozakaya, T., Yamaguchi, O., Okada, R.: Count forest: co-voting uncertain number of targets using random forest for crowd density estimation. In: Proceedings of the IEEE International Conference on Computer Vision (2015)

  205. Shao, J., Kang, K., Loy, C.C., Wang, X.: Deeply learned attributes for crowded scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015)

  206. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–971 (2016)

  207. Gutoski, M., Aquino, N.M.R., Ribeiro, M., Lazzaretti, E.A., Lopes, S.H.: Detection of video anomalies using convolutional autoencoders and one-class support vector machines. http://cbic2017.org/

  208. Feng, Y., Yuan, Y., Lu, X.: Learning deep event models for crowd anomaly detection. Neurocomputing 219, 548–556 (2017)

  209. Zhou, S., Shen, W., Zeng, D., Fang, M., Wei, Y., Zhang, Z.: Spatial-temporal convolutional neural networks for anomaly detection and localization in crowded scenes. Sig. Process. Image Commun. 47, 358–368 (2016)

  210. Smeureanu, S., Ionescu, R.T., Popescu, M., Alexe, B.: Deep appearance features for abnormal behavior detection in video. In: Image Analysis and Processing—ICIAP 2017, Catania, Italy (2017)

  211. Sun, J., Shao, J., He, C.: Abnormal event detection for video surveillance using deep one-class learning. Multimed. Tools Appl. (2017). https://doi.org/10.1007/s11042-017-5244-2

  212. Hinami, R., Mei, T., Satoh, S.: Joint detection and recounting of abnormal events by learning deep generic knowledge. arXiv preprint arXiv:1709.09121 (2017)

  213. Péteri, R., Fazekas, S., Huiskes, M.J.: DynTex: a comprehensive database of dynamic textures. Pattern Recognit. Lett. 31(12), 1627–1632 (2010)

  214. Doretto, G., Chiuso, A., Wu, Y.N., Soatto, S.: Dynamic textures. Int. J. Comput. Vis. 51(2), 91–109 (2003)

  215. Ghanem, B., Ahuja, N.: Maximum margin distance learning for dynamic texture recognition. In: Computer Vision–ECCV 2010 (2010)

  216. Chan, A.B., Liang, Z.-S.J., Vasconcelos, N.: Privacy preserving crowd monitoring: counting people without people models or tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2008)

  217. Chen, K., Loy, C.C., Gong, S., Xiang, T.: Feature mining for localised crowd counting. In: BMVC (2012)

  218. Idrees, H., Saleemi, I., Seibert, C., Shah, M.: Multi-source multi-scale counting in extremely dense crowd images. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)

  219. Blunsden, S., Fisher, R.B.: The BEHAVE video dataset: ground truthed video for multi-person behavior classification. Ann. BMVA 4, 1–12 (2010)

  220. Papadopoulos, S., Schinas, E., Mezaris, V., Troncy, R., Kompatsiaris, I.: Social event detection at MediaEval 2012: challenges, dataset and evaluation. In: Proceedings of MediaEval 2012 Workshop (2012)

  221. Li, L., Su, H., Xing, E., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: NIPS (2010)

  222. Everingham, M., Eslami, S.M.A., Gool, L.V., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111(1), 98–136 (2015)

  223. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollar, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: ECCV, pp. 740–755 (2014)

  224. Dollar, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 743–761 (2012)

  225. Shang, C., Ai, H., Bai, B.: End-to-end crowd counting via joint learning local and global count. In: 2016 IEEE International Conference on Image Processing (ICIP), Phoenix (2016)

  226. Conigliaro, D., Rota, P., Setti, F., Bassetti, C., Conci, N., Sebe, N., Cristani, M.: The S-Hock Dataset: analyzing crowds at the stadium. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  227. Shao, J., Loy, C.C., Wang, X.: Scene-independent group profiling in crowd. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2014)

  228. Wu, S., Yang, H., Zheng, S., Su, H., Fan, Y., Yang, M.-H.: Crowd behavior analysis via curl and divergence of motion trajectories. Int. J. Comput. Vis. 123(3), 499–519 (2017)

  229. Yoo, Y., Yun, K., Yun, S., Hong, J., Jeong, H., Choi, J.Y.: Visual path prediction in complex scenes with crowded moving objects. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

Author information

Corresponding author

Correspondence to Dinesh Kumar Vishwakarma.

About this article

Cite this article

Tripathi, G., Singh, K. & Vishwakarma, D.K. Convolutional neural networks for crowd behaviour analysis: a survey. Vis Comput 35, 753–776 (2019). https://doi.org/10.1007/s00371-018-1499-5

  • DOI: https://doi.org/10.1007/s00371-018-1499-5
