Unsupervised Learning Approach for Abnormal Event Detection in Surveillance Video by Hybrid Autoencoder

Neural Processing Letters

Abstract

Abnormal event detection plays an important role in video surveillance. An LSTM encoder–decoder can learn representations of video sequences and be applied to detect abnormal events in complex environments. The representation is produced by the encoder and is crucial for the decoder. However, because this representation has a fixed dimension, an LSTM encoder–decoder generally fails to account for its global context. In this paper, we explore a hybrid autoencoder architecture that not only extracts better spatio-temporal context but also improves the extrapolation capability of the corresponding decoder through a shortcut connection. Experiments on benchmark datasets show that the hybrid model outperforms state-of-the-art anomaly detection methods both qualitatively and quantitatively.
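
The abstract does not say how reconstructions are turned into abnormality decisions. As a rough illustration only, the sketch below assumes a common convention in reconstruction-based detection: compute a per-frame squared reconstruction error, min-max normalise it into a regularity score, and flag frames whose score falls below a threshold. The function names, the threshold value, and the toy frames are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def frame_errors(frames: Sequence[Sequence[float]],
                 reconstructions: Sequence[Sequence[float]]) -> List[float]:
    """Squared L2 error between each frame and its reconstruction."""
    return [
        sum((x - y) ** 2 for x, y in zip(frame, recon))
        for frame, recon in zip(frames, reconstructions)
    ]


def regularity_scores(errors: Sequence[float]) -> List[float]:
    """Min-max normalise errors so that 1.0 is most regular, 0.0 least."""
    lo, hi = min(errors), max(errors)
    span = hi - lo
    if span == 0:
        # All frames are reconstructed equally well: treat them as regular.
        return [1.0] * len(errors)
    return [1.0 - (e - lo) / span for e in errors]


def flag_abnormal(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Mark frames whose regularity score is below the (assumed) threshold."""
    return [s < threshold for s in scores]


# Toy example: flattened 2-pixel "frames" and made-up reconstructions.
frames = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
recons = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
errors = frame_errors(frames, recons)
scores = regularity_scores(errors)
flags = flag_abnormal(scores)
```

In practice the reconstructions would come from the trained autoencoder, and the threshold would be chosen on validation data rather than fixed in advance.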

References

  1. Zhao B, Li FF, Xing EP (2011) Online detection of unusual events in videos via dynamic sparse coding. In: IEEE conference on computer vision and pattern recognition, pp 3313–3320

  2. Cong Y, Yuan J, Liu J (2011) Sparse reconstruction cost for abnormal event detection. In: IEEE conference on computer vision and pattern recognition, pp 3449–3456

  3. Chen Z, Saligrama V (2012) Video anomaly detection based on local statistical aggregates. In: IEEE conference on computer vision and pattern recognition, pp 2112–2119

  4. Ricci E, Zen G, Sebe N, Messelodi S (2013) A prototype learning framework using EMD: application to complex scenes analysis. IEEE Trans Pattern Anal Mach Intell 35:513–526

  5. Sabokrou M, Fathy M, Hoseini M (2016) Video anomaly detection and localisation based on the sparsity and reconstruction error of auto-encoder. Electron Lett 52:1122–1124

  6. Xu D, Ricci E, Yan Y, Song J, Sebe N (2015) Learning deep representations of appearance and motion for anomalous event detection. In: The British machine vision conference

  7. Hasan M, Choi J, Neumann J, Roy-Chowdhury AK, Davis LS (2016) Learning temporal regularity in video sequences. In: IEEE conference on computer vision and pattern recognition, pp 733–742

  8. Zhou XG, Zhang LQ (2015) Abnormal event detection using recurrent neural network. In: International conference on computer science and applications, pp 222–226

  9. Chong YS, Tay YH (2017) Abnormal event detection in videos using spatiotemporal autoencoder. In: International symposium on neural networks, pp 189–196

  10. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. Adv Neural Inf Process Syst 3:2672–2680

  11. Ravanbakhsh M, Nabi M, Sangineto E, Marcenaro L, Regazzoni C, Sebe N (2017) Abnormal event detection in videos using generative adversarial nets. In: International conference on image processing

  12. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, pp 234–241

  13. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  14. Graves A, Jaitly N (2014) Towards end-to-end speech recognition with recurrent neural networks. In: Proceedings of the 31st international conference on machine learning (ICML-14), pp 1764–1772

  15. Yildirim O (2018) A novel wavelet sequences based on deep bidirectional LSTM network model for ECG signal classification. In: Computers in biology and medicine, S0010482518300738

  16. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 4:3104–3112

  17. Cho K, Courville A, Bengio Y (2015) Describing multimedia content using attention-based encoder–decoder networks. IEEE Trans Multimed 17(11):1875–1886

  18. Kim HY, Won CH (2018) Forecasting the volatility of stock price index: a hybrid model integrating LSTM with multiple GARCH-type models. In: Expert systems with applications, S0957417418301416

  19. Venugopalan S, Xu H, Donahue J et al (2014) Translating videos to natural language using deep recurrent neural networks. arXiv preprint arXiv:1412.4729

  20. Vinyals O, Toshev A, Bengio S et al (2015) Show and tell: a neural image caption generator. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3156–3164

  21. Srivastava N, Mansimov E, Salakhutdinov R (2015) Unsupervised learning of video representations using LSTMs. In: International conference on machine learning, pp 843–852

  22. Wang X, Gao L, Song J et al (2016) Beyond frame-level CNN: saliency-aware 3D CNN with LSTM for video action recognition. IEEE Signal Process Lett PP(99):1–1

  23. Song S, Lan C, Xing J et al (2018) Spatio-temporal attention based LSTM networks for 3D action recognition and detection. IEEE Trans Image Process 1–1

  24. Wang L, Zhou F, Li Z et al (2018) Abnormal event detection in videos using hybrid spatio-temporal autoencoder. In: 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, pp 2276–2280

  25. Ji Y, Cohn T, Kong L et al (2015) Document context language models. arXiv preprint arXiv:1511.03962

  26. Shi X, Chen Z, Wang H, Yeung DY, Wong WK, Woo WC (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Conference and workshop on neural information processing systems, pp 802–810

  27. Wu L, Shen C, van den Hengel A (2016) Convolutional LSTM networks for video-based person re-identification. arXiv preprint arXiv:1606.01609v1

  28. He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition, pp 770–778

  29. Huang G, Liu Z, Weinberger KQ, van der Maaten L (2016) Densely connected convolutional networks. In: IEEE conference on computer vision and pattern recognition

  30. Graves A (2013) Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850

  31. Medel JR (2016) Anomaly detection using predictive convolutional long short-term memory units. Master’s thesis

  32. Vondrick C, Pirsiavash H, Torralba A (2016) Anticipating visual representations from unlabeled video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 98–106

  33. Kozlov Y, Weinkauf T. Persistence1D: extracting and filtering minima and maxima of 1D functions. http://people.mpi-inf.mpg.de/~weinkauf/notes/persistence1d.html

  34. Mahadevan V, Li W, Bhalodia V, Vasconcelos N (2010) Anomaly detection in crowded scenes. In: IEEE conference on computer vision and pattern recognition, pp 1975–1981

  35. Lu C, Shi J, Jia J (2013) Abnormal event detection at 150 FPS in MATLAB. In: IEEE international conference on computer vision, no 3, pp 2720–2727

  36. Adam A, Rivlin E, Shimshoni I et al (2008) Robust real-time unusual event detection using multiple fixed-location monitors. IEEE Trans Pattern Anal Mach Intell 30(3):555–560

  37. Wang T, Snoussi H (2013) Histograms of optical flow orientation for abnormal events detection. In: 2013 IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS). IEEE, pp 45–52

Acknowledgements

This is an extended version of our paper accepted at IEEE ICIP 2018 [24]. This work is supported by the National Natural Science Foundation of China (NSFC) (No. 61471123).

Author information

Corresponding author

Correspondence to Fuqiang Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhou, F., Wang, L., Li, Z. et al. Unsupervised Learning Approach for Abnormal Event Detection in Surveillance Video by Hybrid Autoencoder. Neural Process Lett 52, 961–975 (2020). https://doi.org/10.1007/s11063-019-10113-w

  • DOI: https://doi.org/10.1007/s11063-019-10113-w
