Learning Spatiotemporal Representation Based on 3D Autoencoder for Anomaly Detection

Chang, Yunpeng; Tu, Zhigang; Luo, Bin; Qin, Qianqing

doi:10.1007/978-981-15-3651-9_17

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1180)

Included in the following conference series:

Asian Conference on Pattern Recognition

Abstract

Because of the ambiguous definition of anomaly and the complexity of real data, anomaly detection in videos is of utmost importance in intelligent video surveillance. We approach this problem by learning a novel 3D convolutional autoencoder architecture to capture informative spatiotemporal representations, and a 2D convolutional autoencoder to learn the pixel-wise correspondences of appearance and motion information to boost performance. Experiments on several publicly available datasets demonstrate the effectiveness and competitive performance of our method for anomaly detection in videos.

Supported by Wuhan University.

Acknowledgment

The work is supported by the funding CXFW-18-413100063 of Wuhan University. It is also supported by the Huawei-Wuhan University Funding (No. 250000916) and the National Key Research and Development Program of China (No. 2018YFB1600600).

Author information

Authors and Affiliations

Wuhan University, Wuhan, 430079, China
Yunpeng Chang, Zhigang Tu, Bin Luo & Qianqing Qin

Corresponding author

Correspondence to Zhigang Tu.

Editor information

Editors and Affiliations

University of Waikato, Hamilton, New Zealand
Michael Cree
National Ilan University, Yilan, Taiwan
Fay Huang
State University of New York at Buffalo, Buffalo, NY, USA
Junsong Yuan
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

About this paper

Cite this paper

Chang, Y., Tu, Z., Luo, B., Qin, Q. (2020). Learning Spatiotemporal Representation Based on 3D Autoencoder for Anomaly Detection. In: Cree, M., Huang, F., Yuan, J., Yan, W. (eds) Pattern Recognition. ACPR 2019. Communications in Computer and Information Science, vol 1180. Springer, Singapore. https://doi.org/10.1007/978-981-15-3651-9_17

DOI: https://doi.org/10.1007/978-981-15-3651-9_17
Published: 07 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3650-2
Online ISBN: 978-981-15-3651-9
eBook Packages: Computer Science (R0)