Abstract
Abnormal event detection is a challenging task because object scales vary, the background interferes, and the definition of an anomaly depends on context. In this paper, we propose a new multi-scale feature prediction framework for abnormal event detection. First, we construct a multi-scale alignment feature generator that fuses the characteristics of different receptive fields to handle objects of different scales in a video frame. Second, to weaken the influence of the background, we introduce a novel channel-wise attention mechanism that highlights informative channels while suppressing confusing ones. Finally, an autoencoder-based deep feature prediction module captures temporal and contextual information to generate predicted features. Instead of giving an explicit definition of anomaly, we treat predicted features that differ from the actual features as abnormal. Experimental results on four benchmark datasets demonstrate the superiority of the proposed framework over state-of-the-art approaches.
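The scoring idea in the last step, flagging frames whose predicted features differ from the actual ones, can be illustrated with a minimal sketch. The abstract does not state which distance or normalisation the framework uses, so the mean squared error and the min-max normalisation below are assumptions, and the function names `feature_error` and `anomaly_scores` are hypothetical.

```python
def feature_error(predicted, actual):
    """Mean squared difference between two equal-length feature vectors.

    Assumption: MSE is used here only for illustration; the source does
    not specify the distance measure.
    """
    if len(predicted) != len(actual):
        raise ValueError("feature vectors must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def anomaly_scores(predicted_seq, actual_seq):
    """Min-max normalise per-frame errors to [0, 1]; higher = more abnormal.

    Assumption: per-sequence min-max normalisation is a common convention
    in prediction-based anomaly detection, not something the source states.
    """
    errors = [feature_error(p, a) for p, a in zip(predicted_seq, actual_seq)]
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # All frames are predicted equally well: nothing stands out.
        return [0.0 for _ in errors]
    return [(e - lo) / (hi - lo) for e in errors]


# Toy data: three frames, each described by a 2-dimensional feature vector.
predicted = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
actual = [[0.0, 1.0], [1.0, 2.0], [0.0, 0.0]]
scores = anomaly_scores(predicted, actual)
print(scores)
```

In this toy run the third frame has the largest prediction error and therefore the highest score. In practice the feature vectors would come from the prediction module rather than from hand-written lists.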






Cite this article
Xia, L., Wei, C. Abnormal event detection in surveillance videos based on multi-scale feature and channel-wise attention mechanism. J Supercomput 78, 13470–13490 (2022). https://doi.org/10.1007/s11227-022-04410-w
DOI: https://doi.org/10.1007/s11227-022-04410-w