Residual spatiotemporal autoencoder for unsupervised video anomaly detection

  • Original Paper
Signal, Image and Video Processing

Abstract

Modeling abnormal spatiotemporal events is challenging because data from abnormal activities are scarce in a surveillance stream. We address this issue with a normality modeling approach, in which abnormalities are detected as deviations from normal patterns. To this end, we propose a residual spatiotemporal autoencoder that is trainable end-to-end for anomaly detection in surveillance videos. Irregularities are detected using the reconstruction loss: normal frames are reconstructed well, with a low reconstruction cost, whereas frames with a high reconstruction cost are identified as abnormal. We evaluate the effect of residual connections in the spatiotemporal autoencoder (STAE) architecture and present good practices for training an autoencoder for video anomaly detection using the benchmark datasets CUHK-Avenue, UCSD-Ped2, and Live Videos. Comparisons with existing approaches indicate that adding residual blocks is more effective than going deeper with additional layers when training a spatiotemporal autoencoder that generalizes well across datasets.
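The reconstruction-based detection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-error measure, the min-max normalization, the threshold of 0.5, and all function names are assumptions chosen for illustration, and the reconstructions are hand-written stand-ins for the output of a trained autoencoder.

```python
def frame_error(frame, reconstruction):
    """Squared L2 error between one frame and its reconstruction.

    Both arguments are 2-D lists of pixel intensities with equal shape.
    """
    return sum(
        (p - q) ** 2
        for row, rec_row in zip(frame, reconstruction)
        for p, q in zip(row, rec_row)
    )


def anomaly_scores(errors):
    """Min-max normalize per-frame errors to [0, 1]; higher = more anomalous.

    The normalization is an assumed convention, not taken from the paper.
    """
    lo, hi = min(errors), max(errors)
    if hi == lo:  # every frame is reconstructed equally well
        return [0.0 for _ in errors]
    return [(e - lo) / (hi - lo) for e in errors]


def flag_abnormal(scores, threshold=0.5):
    """Mark frames whose score exceeds a hypothetical threshold."""
    return [s > threshold for s in scores]


if __name__ == "__main__":
    frame = [[0.0, 1.0], [1.0, 0.0]]
    frames = [frame, frame, frame]
    # Stand-ins for autoencoder outputs: exact, slightly off, badly off.
    reconstructions = [
        [[0.0, 1.0], [1.0, 0.0]],
        [[0.0, 1.0], [1.0, 0.5]],
        [[1.0, 0.0], [0.0, 1.0]],
    ]
    errors = [frame_error(f, r) for f, r in zip(frames, reconstructions)]
    print(flag_abnormal(anomaly_scores(errors)))
```

In practice the threshold would be tuned on held-out data, and the error would be computed over short clips of frames if the autoencoder reconstructs spatiotemporal volumes instead of single frames.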

Acknowledgements

The authors acknowledge funding from the Council of Scientific and Industrial Research (CSIR) (09/1095(0043)/19-EMR-I) and from project No. DST/CSRI/2017/131(G) under the Cognitive Science Research Initiative (CSRI), sanctioned by the Department of Science and Technology, Government of India.

Author information

Corresponding author

Correspondence to S. Chandrakala.

Cite this article

Deepak, K., Chandrakala, S. & Mohan, C.K. Residual spatiotemporal autoencoder for unsupervised video anomaly detection. SIViP 15, 215–222 (2021). https://doi.org/10.1007/s11760-020-01740-1

  • DOI: https://doi.org/10.1007/s11760-020-01740-1
