An efficient deep neural model for detecting crowd anomalies in videos

Applied Intelligence

Abstract

In video surveillance applications, identifying unusual crowd events is highly challenging, laborious, and error-prone. To address these challenges, we propose DeepSDAE, a novel end-to-end deep learning architecture that comprises a Stacked Denoising Auto-Encoder (SDAE), VGG16, and a Plane-based one-class Support Vector Machine (PSVM). It detects anomalies such as stationary people in an active scene or loitering in a crowded scene. DeepSDAE is a hybrid deep learning architecture consisting of a four-layered SDAE and an enhanced convolutional neural network (CNN) model. The framework employs reinforcement learning to optimise the learning parameters for crowd anomaly detection: we use a Markov Decision Process (MDP) with Deep Q-learning to find the optimal Q value. We also present a late fusion procedure that combines individual decisions from the intermediate and final layers of the SDAE and VGG16 networks to detect different anomalies. Experiments on four real-world datasets show the superior performance of the proposed framework in detecting anomalies at both the frame level and the pixel level.
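
The late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted average used here is an assumption, as are the function name `late_fusion`, the detector names, the weights, and the threshold. Scores are assumed to follow the common one-class SVM convention in which a negative decision value indicates an anomaly.

```python
from typing import Dict, List


def late_fusion(
    scores: Dict[str, List[float]],
    weights: Dict[str, float],
    threshold: float = 0.0,
) -> List[bool]:
    """Fuse per-detector decision scores into one anomaly flag per frame.

    Assumptions (not taken from the paper): each detector emits a signed
    score per frame, negative meaning "anomalous" as in a one-class SVM,
    and fusion is a weighted average compared against a threshold.
    """
    if not scores:
        raise ValueError("at least one detector is required")
    n_frames = len(next(iter(scores.values())))
    if any(len(s) != n_frames for s in scores.values()):
        raise ValueError("all detectors must score the same frames")
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    flags = []
    for i in range(n_frames):
        # Weighted average of the individual decisions for frame i.
        fused = sum(weights[name] * s[i] for name, s in scores.items()) / total_weight
        flags.append(fused < threshold)  # below threshold -> anomalous frame
    return flags


# Hypothetical decision values for three frames from three detectors
# (intermediate and final SDAE layers, final VGG16 layer).
example_scores = {
    "sdae_intermediate": [0.8, -0.2, -0.9],
    "sdae_final": [0.5, 0.1, -0.7],
    "vgg16_final": [0.9, -0.4, -0.6],
}
example_weights = {"sdae_intermediate": 1.0, "sdae_final": 1.0, "vgg16_final": 2.0}
fused_flags = late_fusion(example_scores, example_weights)
```

With these made-up numbers, the second frame is flagged as anomalous even though one detector scores it as normal, because the other two detectors, one of them weighted more heavily, score it as anomalous.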

Notes

  1. Sequence of states, actions and rewards that reach a terminal state

Acknowledgments

The authors are very grateful to the Editor and the anonymous reviewers for their valuable comments and suggestions, which greatly improved the presentation and quality of this paper. This work was supported by the Natural Science Foundation of China under Grant 12201523 and by the Fundamental Research Funds for the Central Universities under Grant No. 2682021CX078.

Author information

Corresponding author

Correspondence to Zhengchun Zhou.

About this article

Cite this article

Yang, M., Tian, S., Rao, A.S. et al. An efficient deep neural model for detecting crowd anomalies in videos. Appl Intell 53, 15695–15710 (2023). https://doi.org/10.1007/s10489-022-04233-5

  • DOI: https://doi.org/10.1007/s10489-022-04233-5
