Memory-Augmented Spatial-Temporal Consistency Network for Video Anomaly Detection

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14430)

Abstract

Video anomaly detection (VAD) in intelligent surveillance systems is a crucial yet highly challenging task. Since appearance and motion information is vital for identifying anomalies, existing unsupervised VAD methods usually learn normality from them. However, these approaches tend to consider appearance and motion separately or simply integrate them while ignoring the consistency between them, resulting in sub-optimal performance. To address this problem, we propose a Memory-Augmented Spatial-Temporal Consistency Network, aiming to model the latent consistency between spatial appearance and temporal motion by learning the unified spatiotemporal representation. Additionally, we introduce a spatial-temporal memory fusion module to record spatial and temporal prototypes of regular patterns from the unified spatiotemporal representation, increasing the gap between normal and abnormal events in the feature space. Experimental results on three benchmarks demonstrate the effectiveness of the spatial-temporal consistency for VAD tasks. Our method performs comparably to the state-of-the-art methods with AUCs of 97.6%, 89.3%, and 73.3% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively.

This work is partially supported by the National Natural Science Foundation of China (Grant No. 61972016) and the Science and Technology Commission of Shanghai Municipality Research Fund (Grant No. 21JC1405300).

Author information

Corresponding author

Correspondence to Xinhua Zeng.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Li, Z., Zhao, M., Zeng, X., Wang, T., Pang, C. (2024). Memory-Augmented Spatial-Temporal Consistency Network for Video Anomaly Detection. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14430. Springer, Singapore. https://doi.org/10.1007/978-981-99-8537-1_8

  • DOI: https://doi.org/10.1007/978-981-99-8537-1_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8536-4

  • Online ISBN: 978-981-99-8537-1

  • eBook Packages: Computer Science, Computer Science (R0)
