Clustering Driven Deep Autoencoder for Video Anomaly Detection

Chang, Yunpeng; Tu, Zhigang; Xie, Wei; Yuan, Junsong

doi:10.1007/978-3-030-58555-6_20

Conference paper
First Online: 16 November 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12360)

Abstract

Because the definition of an anomaly is ambiguous and real data are complex, video anomaly detection is one of the most challenging problems in intelligent video surveillance. Since abnormal events usually differ from normal events in appearance and/or motion behavior, we address this problem by designing a novel convolutional autoencoder architecture that captures informative spatial and temporal representations separately. The spatial part reconstructs the last individual frame (LIF), while the temporal part takes consecutive frames as input and outputs RGB differences to simulate the generation of optical flow. Abnormal events that are irregular in appearance or in motion behavior therefore lead to a large reconstruction error. In addition, we design a deep k-means cluster to force the appearance and motion encoders to extract the common factors of variation within the dataset. Experiments on several publicly available datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance.
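The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the network layers, the loss weighting, or the exact scoring rule, so the function names (`rgb_difference`, `reconstruction_error`, `kmeans_loss`, `anomaly_score`), the mean-squared-error measure, and the `weight` parameter are all assumptions made for illustration. The reconstructions are passed in as arrays, standing in for the outputs of the spatial and temporal decoders.

```python
import numpy as np


def rgb_difference(clip):
    """Motion target: differences between consecutive frames.

    clip has shape (T, H, W, C); the result has shape (T-1, H, W, C).
    """
    clip = np.asarray(clip, dtype=np.float64)
    return clip[1:] - clip[:-1]


def reconstruction_error(target, reconstruction):
    """Mean squared error between a target and its reconstruction (assumed metric)."""
    target = np.asarray(target, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    return float(np.mean((target - reconstruction) ** 2))


def kmeans_loss(features, centers):
    """Mean squared distance from each feature vector to its nearest center.

    features has shape (N, D); centers has shape (K, D).
    """
    features = np.asarray(features, dtype=np.float64)
    centers = np.asarray(centers, dtype=np.float64)
    # Pairwise squared distances, shape (N, K), via broadcasting.
    diffs = features[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    return float(np.mean(np.min(sq_dists, axis=1)))


def anomaly_score(clip, recon_last_frame, recon_rgb_diff, weight=1.0):
    """Combine appearance and motion reconstruction errors.

    The additive combination and `weight` are hypothetical choices.
    """
    clip = np.asarray(clip, dtype=np.float64)
    appearance_err = reconstruction_error(clip[-1], recon_last_frame)
    motion_err = reconstruction_error(rgb_difference(clip), recon_rgb_diff)
    return appearance_err + weight * motion_err
```

Under these assumptions, a clip whose last frame and RGB differences are reconstructed perfectly scores zero, and the score grows as either reconstruction degrades. During training, a term like `kmeans_loss` would be added to the reconstruction losses so that encoder features gather around a small set of centers.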

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities (2042020KF0016 and CCNU20TS028). It was also supported by the Wuhan University-Huawei Company Project.

Author information

Authors and Affiliations

Wuhan University, Wuhan, 430079, China
Yunpeng Chang & Zhigang Tu
Central China Normal University, Wuhan, 430079, China
Wei Xie
State University of New York at Buffalo, Buffalo, NY, 14260-2500, USA
Junsong Yuan

Corresponding author

Correspondence to Zhigang Tu.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Chang, Y., Tu, Z., Xie, W., Yuan, J. (2020). Clustering Driven Deep Autoencoder for Video Anomaly Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12360. Springer, Cham. https://doi.org/10.1007/978-3-030-58555-6_20

DOI: https://doi.org/10.1007/978-3-030-58555-6_20
Published: 16 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58554-9
Online ISBN: 978-3-030-58555-6
eBook Packages: Computer Science, Computer Science (R0)