Abstract
Weakly-supervised video anomaly detection uses video-level labels, avoiding the need to annotate every frame or segment of the training videos. The problem is typically cast as multiple instance learning: training learns to score video segments so that abnormal segments receive higher scores than normal ones. Features are extracted from the videos before training or testing. Although many models have been proposed and have achieved good results, improving performance on this problem remains a challenge. This study proposes a convolutional autoencoder-based approach to reconstructing features, under the assumption that the reconstructed features retain information bound early to the objective of classifying a video as normal or abnormal (minimizing the cost function). The approach is validated on the ShanghaiTech Campus dataset, where it outperforms state-of-the-art methods.
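The multiple-instance formulation described above can be sketched with the common hinge ranking objective (as in Sultani et al.): the top-scoring segment of an abnormal video should outrank the top-scoring segment of a normal video by a margin. This is a minimal illustration, not the paper's exact loss; the segment scores below are hypothetical placeholders.

```python
import numpy as np

def mil_ranking_loss(abnormal_scores, normal_scores, margin=1.0):
    """Hinge ranking loss over the top-scoring segment of each bag:
    the highest score in the abnormal video should exceed the highest
    score in the normal video by at least `margin`."""
    return max(0.0, margin - np.max(abnormal_scores) + np.max(normal_scores))

# Illustrative per-segment anomaly scores for two videos (hypothetical).
abnormal = np.array([0.1, 0.9, 0.3, 0.2])  # one segment looks anomalous
normal   = np.array([0.2, 0.1, 0.3, 0.1])

loss = mil_ranking_loss(abnormal, normal)
print(round(loss, 2))  # 0.4 = max(0, 1 - 0.9 + 0.3)
```

Only the maximum of each bag enters the loss, which is what lets video-level labels supervise segment-level scores.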
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hieu, P.N.D., Hung, P.D. (2023). A Convolutional Autoencoder Approach for Weakly Supervised Anomaly Video Detection. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science, vol 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41455-8
Online ISBN: 978-3-031-41456-5
eBook Packages: Computer Science (R0)