Abstract
Anomaly detection in surveillance video is important for public safety and for the protection of specific scenes. However, existing methods fail to balance accuracy and real-time performance. In this paper, we propose a two-stream spatio-temporal generative model (TSSTGM) that detects abnormal behaviors in surveillance videos in real time. We construct an end-to-end deep-learning framework for video reconstruction and prediction, and detect anomalies from the reconstruction error and the prediction error. Specifically, we design a fully convolutional structure that allows the model to accept input videos of any size. To maintain performance in complex scenes, appearance, temporal, and motion features are fully explored and fed into the discriminator, so that the model is trained with adversarial learning. Moreover, the input design and the way optical flow is computed allow the model to run in real time. Experiments on two real-world datasets show that, while satisfying the real-time requirement, TSSTGM remains competitive with existing methods, both real-time and non-real-time, on the AUC and EER metrics. Our model has been deployed in several campus security surveillance systems to detect dangerous behaviors and protect the personal safety of students.
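To make the scoring idea concrete, the following is a minimal sketch of how a per-frame anomaly score could be derived from reconstruction and prediction errors and then evaluated with frame-level AUC and EER. It is not the authors' implementation: the function names, the mean-squared-error measure, the mixing factor `weight`, and the per-video min-max normalisation are all assumptions made for illustration, and the generative model itself is replaced by arrays that stand in for its outputs.

```python
import numpy as np


def frame_errors(target, output):
    """Per-frame mean squared error between two (T, H, W, C) arrays."""
    diff = np.asarray(target, dtype=np.float64) - np.asarray(output, dtype=np.float64)
    return (diff ** 2).mean(axis=(1, 2, 3))


def anomaly_scores(frames, reconstructed, next_frames, predicted, weight=0.5):
    """Blend reconstruction and prediction errors into scores in [0, 1].

    `weight` is a hypothetical mixing factor, not a value from the paper.
    """
    rec = frame_errors(frames, reconstructed)
    pred = frame_errors(next_frames, predicted)
    raw = weight * rec + (1.0 - weight) * pred
    span = raw.max() - raw.min()
    # Min-max normalise within one video; a constant signal maps to zeros.
    return (raw - raw.min()) / span if span > 0 else np.zeros_like(raw)


def auc_and_eer(scores, labels):
    """Frame-level AUC and EER from scores and binary labels (1 = abnormal).

    Assumes both normal and abnormal frames are present in `labels`.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels).astype(bool)
    # Sweep thresholds from strictest to loosest to trace the ROC curve.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    tpr = np.array([(scores[labels] >= t).mean() for t in thresholds])
    fpr = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    # Trapezoidal area under the ROC curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    # EER: operating point where the false positive rate is closest to
    # the false negative rate (1 - TPR).
    i = int(np.argmin(np.abs(fpr - (1.0 - tpr))))
    eer = float((fpr[i] + (1.0 - tpr[i])) / 2.0)
    return auc, eer
```

In this sketch a higher score marks a frame as more likely abnormal, so a higher AUC and a lower EER both indicate better separation of abnormal from normal frames. The EER here is taken at the nearest swept threshold rather than interpolated, which is a simplification.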
Funding
This work is supported by the National Key R&D Project of China under Grant No. 2021QY2102; the National Natural Science Foundation of China under Grants No. 62172089 and No. 61972087; the Natural Science Foundation of Jiangsu Province under Grant No. BK20191258; the Jiangsu Provincial Key Laboratory of Computer Networking Technology; the Jiangsu Provincial Key Laboratory of Network and Information Security under Grant No. BM2003201; the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China under Grant No. 93K-9; and Nanjing Purple Mountain Laboratory.
Additional information
Communicated by E. Ricci.
About this article
Cite this article
Liu, W., Cao, J., Zhu, Y. et al. Real-time anomaly detection on surveillance video with two-stream spatio-temporal generative model. Multimedia Systems 29, 59–71 (2023). https://doi.org/10.1007/s00530-022-00979-7
DOI: https://doi.org/10.1007/s00530-022-00979-7