Abstract
Automatically detecting anomalous events in surveillance videos is crucial for security maintenance. Because the task is challenging, the performance of existing approaches remains limited. In this study, we propose a video anomaly detection method called the multi-scale Siamese prediction framework (MSSP), in which a Siamese network exploits the information embedded in observed anomalous events without requiring any additional parameters. To extract spatiotemporal features, we introduce a multi-scale term that combines an improved inception module with a convolutional GRU (Conv-GRU) module. Both modules are employed in each layer of the U-Net encoding stage to mitigate the information loss caused by subsampling. To further optimize the model, we propose a loss function that combines the prediction loss with a contrastive loss. We evaluate the system on three public datasets: CUHK Avenue, UCSD Ped2, and ShanghaiTech. Experimental results show that the MSSP framework achieves AUC values of 89.4%, 97.4%, and 73.83% on these datasets, respectively, significantly outperforming the other methods compared.
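As a rough illustration of how a prediction loss and a contrastive loss can be combined into one training objective, the following minimal sketch uses NumPy. The abstract does not give the exact formulation, so everything here is an assumption: the prediction loss is taken to be the mean squared error between the predicted and the true frame, the contrastive loss follows the commonly used margin-based pairwise form, and the names `prediction_loss`, `contrastive_loss`, `combined_loss`, `weight`, and `margin` are hypothetical.

```python
import numpy as np


def prediction_loss(pred_frame, true_frame):
    """Mean squared error between predicted and ground-truth frames (assumed form)."""
    return float(np.mean((pred_frame - true_frame) ** 2))


def contrastive_loss(feat_a, feat_b, is_dissimilar, margin=1.0):
    """Margin-based pairwise contrastive loss (assumed form).

    is_dissimilar = 0: the pair should be close, so distance is penalized.
    is_dissimilar = 1: the pair should be at least `margin` apart.
    """
    dist = float(np.linalg.norm(feat_a - feat_b))
    pull = (1 - is_dissimilar) * dist ** 2
    push = is_dissimilar * max(margin - dist, 0.0) ** 2
    return 0.5 * (pull + push)


def combined_loss(pred_frame, true_frame, feat_a, feat_b,
                  is_dissimilar, weight=0.5, margin=1.0):
    """Weighted sum of the two terms; `weight` is a hypothetical hyperparameter."""
    return (prediction_loss(pred_frame, true_frame)
            + weight * contrastive_loss(feat_a, feat_b, is_dissimilar, margin))


# Toy inputs: a 2x2 "frame" and 2-D feature vectors from the two Siamese branches.
pred = np.zeros((2, 2))
true = np.ones((2, 2))
f_a = np.array([0.0, 0.0])
f_b = np.array([3.0, 4.0])
total = combined_loss(pred, true, f_a, f_b, is_dissimilar=0)
```

In this sketch, a pair labeled as dissimilar contributes nothing once its feature distance exceeds the margin, so only pairs that are still too close influence the contrastive term.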
Acknowledgements
This project was supported by the National Key Research and Development Program of China under Grant 2017YFC1703302.
About this article
Cite this article
Yang, J., Cai, Y., Liu, D. et al. Multi-scale Siamese prediction network for video anomaly detection. SIViP 17, 671–678 (2023). https://doi.org/10.1007/s11760-022-02274-4
DOI: https://doi.org/10.1007/s11760-022-02274-4