Skip to main content

Clustering Driven Deep Autoencoder for Video Anomaly Detection

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12360))

Abstract

Because of the ambiguous definition of anomaly and the complexity of real data, video anomaly detection is one of the most challenging problems in intelligent video surveillance. Since the abnormal events are usually different from normal events in appearance and/or in motion behavior, we address this issue by designing a novel convolution autoencoder architecture to separately capture spatial and temporal informative representation. The spatial part reconstructs the last individual frame (LIF), while the temporal part takes consecutive frames as input and RGB difference as output to simulate the generation of optical flow. The abnormal events which are irregular in appearance or in motion behavior lead to a large reconstruction error. Besides, we design a deep k-means cluster to force the appearance and the motion encoder to extract common factors of variation within the dataset. Experiments on some publicly available datasets demonstrate the effectiveness of our method with the state-of-the-art performance.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

  1. Abati, D., Porrello, A., Calderara, S., Cucchiara, R.: Latent space autoregression for novelty detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 481–490 (2019)

    Google Scholar 

  2. Blanchard, G., Lee, G., Scott, C.: Semi-supervised novelty detection. J. Mach. Learn. Res. 11, 2973–3009 (2010)

    MathSciNet  MATH  Google Scholar 

  3. Chang, Y., Tu, Z., Luo, B., Qin, Q.: Learning spatiotemporal representation based on 3D autoencoder for anomaly detection. In: Cree, M., Huang, F., Yuan, J., Yan, W.Q. (eds.) ACPR 2019. CCIS, vol. 1180, pp. 187–195. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-3651-9_17

    Chapter  Google Scholar 

  4. Fard, M.M., Thonet, T., Gaussier, E.: Deep k-means: jointly clustering with k-means and learning representations. arXiv, Learning (2018)

    Google Scholar 

  5. Ghasedi Dizaji, K., Herandi, A., Deng, C., Cai, W., Huang, H.: Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization. In: IEEE International Conference on Computer Vision (CVPR), pp. 5736–5745 (2017)

    Google Scholar 

  6. Gong, D., et al.: Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection. In: IEEE International Conference on Computer Vision (ICCV), pp. 1705–1714 (2019)

    Google Scholar 

  7. Guo, X., Gao, L., Liu, X., Yin, J.: Improved deep embedded clustering with local structure preservation. In: International Joint Conferences on Artificial Intelligence (IJCAI), pp. 1753–1759 (2017)

    Google Scholar 

  8. Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., Davis, L.S.: Learning temporal regularity in video sequences. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 733–742 (2016)

    Google Scholar 

  9. Hinami, R., Mei, T., Satoh, S.: Joint detection and recounting of abnormal events by learning deep generic knowledge. In: IEEE International Conference on Computer Vision (ICCV), pp. 3619–3627 (2017)

    Google Scholar 

  10. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

    Article  MathSciNet  Google Scholar 

  11. Hsu, C., Lin, C.: CNN-based joint clustering and representation learning with feature drift compensation for large-scale image data. IEEE Trans. Multimed. 20(2), 421–429 (2017)

    Article  Google Scholar 

  12. Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., Brox, T.: FlowNet 2.0: evolution of optical flow estimation with deep networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2462–2470 (2017)

    Google Scholar 

  13. Ionescu, R.T., Khan, F.S., Georgescu, M.I., Shao, L.: Object-centric auto-encoders and dummy anomalies for abnormal event detection in video. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 7842–7851 (2019)

    Google Scholar 

  14. Kim, J., Grauman, K.: Observe locally, infer globally: a space-time MRF for detecting abnormal activities with incremental updates. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2921–2928 (2009)

    Google Scholar 

  15. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (ICLR) (2015)

    Google Scholar 

  16. Lin, Z., et al.: A structured self-attentive sentence embedding. In: International Conference on Learning Representations (ICLR) (2017)

    Google Scholar 

  17. Liu, W., Luo, W., Lian, D., Gao, S.: Future frame prediction for anomaly detection-a new baseline. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 6536–6545 (2018)

    Google Scholar 

  18. Liu, Y., Zheng, Y.F.: Minimum enclosing and maximum excluding machine for pattern description and discrimination. In: International Conference on Pattern Recognition (ICPR), vol. 3, pp. 129–132 (2006)

    Google Scholar 

  19. Lu, C., Shi, J., Jia, J.: Abnormal event detection at 150 fps in matlab. In: IEEE International Conference on Computer Vision, pp. 2720–2727 (2013)

    Google Scholar 

  20. Luo, W., Liu, W., Gao, S.: Remembering history with convolutional LSTM for anomaly detection. In: International Conference on Multimedia and Expo (ICME), pp. 439–444 (2017)

    Google Scholar 

  21. Luo, W., Liu, W., Gao, S.: A revisit of sparse coding based anomaly detection in stacked RNN framework. In: IEEE International Conference on Computer Vision, pp. 341–349 (2017)

    Google Scholar 

  22. Mahadevan, V., Li, W., Bhalodia, V., Vasconcelos, N.: Anomaly detection in crowded scenes. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1975–1981. IEEE (2010)

    Google Scholar 

  23. Nguyen, T.N., Meunier, J.: Anomaly detection in video sequence with appearance-motion correspondence. In: IEEE International Conference on Computer Vision (ICCV), pp. 1273–1283 (2019)

    Google Scholar 

  24. Perera, P., Nallapati, R., Xiang, B.: OCGAN: one-class novelty detection using GANs with constrained latent representations. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2898–2906 (2019)

    Google Scholar 

  25. Poultney, C., Chopra, S., Cun, Y.L., et al.: Efficient learning of sparse representations with an energy-based model. In: Advances in Neural Information Processing Systems, pp. 1137–1144 (2007)

    Google Scholar 

  26. Ravanbakhsh, M., Nabi, M., Sangineto, E., Marcenaro, L., Regazzoni, C., Sebe, N.: Abnormal event detection in videos using generative adversarial nets. In: IEEE International Conference on Image Processing (ICIP), pp. 1577–1581 (2017)

    Google Scholar 

  27. Rifai, S., Vincent, P., Muller, X., Glorot, X., Bengio, Y.: Contractive auto-encoders: explicit invariance during feature extraction. In: International Conference on Machine Learning (ICML), pp. 833–840 (2011)

    Google Scholar 

  28. Ruff, L., et al.: Deep one-class classification. In: International Conference on Machine Learning, pp. 4393–4402 (2018)

    Google Scholar 

  29. Ruff, L., et al.: Deep semi-supervised anomaly detection. In: International Conference on Learning Representations (ICLR) (2020)

    Google Scholar 

  30. Simonyan, K., Zisserman, A.: Two-stream convolutional networks for action recognition in videos. In: Advances in Neural Information Processing Systems, pp. 568–576 (2014)

    Google Scholar 

  31. Srivastava, N., Mansimov, E., Salakhudinov, R.: Unsupervised learning of video representations using LSTMs. In: International Conference on Machine Learning, pp. 843–852 (2015)

    Google Scholar 

  32. Tu, Z., et al.: Multi-stream CNN: learning representations based on human-related regions for action recognition. Pattern Recogn. 79, 32–43 (2018)

    Article  Google Scholar 

  33. Tu, Z., et al.: A survey of variational and CNN-based optical flow techniques. Sig. Process. Image Commun. 72, 9–24 (2019)

    Article  Google Scholar 

  34. Tung, F., Zelek, J.S., Clausi, D.A.: Goal-based trajectory analysis for unusual behaviour detection in intelligent surveillance. Image Vis. Comput. 29(4), 230–240 (2011)

    Article  Google Scholar 

  35. Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: International Conference on Machine Learning (ICML), pp. 1096–1103 (2008)

    Google Scholar 

  36. Wang, L., et al.: Temporal segment networks for action recognition in videos. IEEE Trans. Pattern Anal. Mach. Intell. 41, 2740–2755 (2018)

    Article  Google Scholar 

  37. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: International Conference on Machine Learning, pp. 478–487 (2016)

    Google Scholar 

  38. Xu, D., Yan, Y., Ricci, E., Sebe, N.: Detecting anomalous events in videos by learning deep representations of appearance and motion. Comput. Vis. Image Underst. 156, 117–127 (2017)

    Article  Google Scholar 

  39. Yan, M., Meng, J., Zhou, C., Tu, Z., Tan, Y.P., Yuan, J.: Detecting spatiotemporal irregularities in videos via a 3D convolutional autoencoder. J. Vis. Commun. Image Represent. 67, 102747 (2020)

    Article  Google Scholar 

  40. Yu, T., Ren, Z., Li, Y., Yan, E., Xu, N., Yuan, J.: Temporal structure mining for weakly supervised action detection. In: IEEE International Conference on Computer Vision, pp. 5522–5531 (2019)

    Google Scholar 

  41. Zach, C., Pock, T., Bischof, H.: A duality based approach for realtime TV-\(L^1\) optical flow. In: Hamprecht, F.A., Schnörr, C., Jähne, B. (eds.) DAGM 2007. LNCS, vol. 4713, pp. 214–223. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74936-3_22

    Chapter  Google Scholar 

  42. Zhao, B., Fei-Fei, L., Xing, E.P.: Online detection of unusual events in videos via dynamic sparse coding. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3313–3320 (2011)

    Google Scholar 

  43. Zimek, A., Schubert, E., Kriegel, H.P.: A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5(5), 363–387 (2012)

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities (2042020KF0016 and CCNU20TS028). It was also supported by the Wuhan University-Huawei Company Project.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhigang Tu .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Chang, Y., Tu, Z., Xie, W., Yuan, J. (2020). Clustering Driven Deep Autoencoder for Video Anomaly Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12360. Springer, Cham. https://doi.org/10.1007/978-3-030-58555-6_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-58555-6_20

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58554-9

  • Online ISBN: 978-3-030-58555-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics