Abstract
This paper introduces FEMemAE-Jigsaw, a hybrid framework for video anomaly detection that fuses reconstruction with jigsaw puzzle detection. First, we develop a new reconstruction model, FEMemAE, whose expanded memory module retains more of the information in the original input. A Large Kernel selection module lets the model attend to more feature information, and a Fast Channel Attention mechanism lets it filter useful features more efficiently, so the reconstructed images are more discriminative. The reconstructed frames are then passed to a jigsaw puzzle detector which, trained on the spatial information of video frames, determines whether an input frame is anomalous. Because the quality of the reconstruction directly affects jigsaw puzzle detection, clearer and more discriminative reconstructions help the model separate normal from abnormal events. Experimental results show that the method outperforms existing methods on several standard datasets.
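To make the jigsaw puzzle idea concrete, the following is a minimal sketch of how a spatial jigsaw sample could be built from one frame and how a detector's output could be turned into an anomaly score. It is not the authors' implementation: the function names `make_jigsaw` and `anomaly_score`, the square `grid` layout, and the scoring rule (one minus the lowest probability assigned to a correct patch position) are all assumptions chosen for illustration, and the reconstruction network and the puzzle solver themselves are left out.

```python
import numpy as np


def make_jigsaw(frame, grid, rng):
    """Split a frame (H, W, C) into grid x grid patches and shuffle them.

    Returns the shuffled frame and the permutation that was applied.
    The permutation is the label a puzzle solver would be trained to predict.
    (Hypothetical helper, not taken from the paper.)
    """
    h, w = frame.shape[:2]
    if h % grid or w % grid:
        raise ValueError("frame height and width must be divisible by grid")
    ph, pw = h // grid, w // grid
    # patches[k] is the original patch at row k // grid, column k % grid.
    patches = [
        frame[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
        for r in range(grid)
        for c in range(grid)
    ]
    perm = rng.permutation(grid * grid)
    # Position i of the shuffled frame receives original patch perm[i].
    rows = [
        np.concatenate([patches[perm[r * grid + c]] for c in range(grid)], axis=1)
        for r in range(grid)
    ]
    return np.concatenate(rows, axis=0), perm


def anomaly_score(pred_probs, perm):
    """Score a frame from a solver's per-position predictions.

    pred_probs[i, k] is the predicted probability that position i holds
    original patch k. The score rule used here is an assumption: a frame
    is as suspicious as its least confidently solved position.
    """
    true_probs = pred_probs[np.arange(len(perm)), perm]
    return 1.0 - float(true_probs.min())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((64, 64, 3))          # stand-in for a reconstructed frame
    shuffled, perm = make_jigsaw(frame, grid=3, rng=rng)
    uniform = np.full((9, 9), 1.0 / 9.0)     # a solver that has learned nothing
    print(shuffled.shape, anomaly_score(uniform, perm))
```

Under this sketch, a solver trained only on normal frames would be expected to recover the permutation confidently for normal input and less confidently for anomalous input, which is why the lowest per-position confidence is used as the signal.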
Data availability
The datasets that support the findings of this study are available from the corresponding author, Ning He, upon reasonable request.
Acknowledgements
The authors would like to thank the anonymous reviewers for their kind and valuable comments.
Funding
This work is supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Author information
Authors and Affiliations
Contributions
Hongfei Liu wrote the main manuscript text. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
There are no ethical issues with this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, H., He, N., Huang, X. et al. A video anomaly detection framework based on hybrid feature-enhanced memory reconstruction and jigsaw puzzle. SIViP 19, 12 (2025). https://doi.org/10.1007/s11760-024-03570-x
DOI: https://doi.org/10.1007/s11760-024-03570-x