Abstract
Video anomaly detection approaches are often limited in how well they represent a scene, because video captured from a single perspective may not reveal all of its relevant aspects. The scarcity of labeled data is a further obstacle to building video anomaly detection approaches. Multiple instance learning (MIL) has been explored extensively in the weakly supervised video anomaly detection (WS-VAD) literature because it requires less labeled data, but no studies exploit multiple overlapping camera views to provide a wider representation of the visual data under the MIL assumption. In this work, we show that performance on the video anomaly detection task can be improved by using multiple cameras to capture spatiotemporal information from different perspectives. We propose MC-MIL (Video Anomaly Detection with Multiple Overlapped Cameras and Multiple Instance Learning), a framework consisting of a training scheme that combines multiple cameras with multiple instance learning for video anomaly detection. As a proof of concept for performance evaluation, we specialize the proposed framework to the two-camera case. Because no datasets exist for this task, we relabeled the multiple-camera PETS-2009 benchmark dataset for anomaly detection from multiple overlapped camera views and used it to evaluate the MC-MIL algorithm. The results show a significant improvement in the AUC ROC score compared with the single-camera configuration and with results reported in the literature.
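To make the idea concrete, the following is a minimal sketch of how per-segment anomaly scores from two overlapping cameras could feed a MIL-style ranking objective. It is not the MC-MIL training scheme itself, which the abstract does not specify: the element-wise maximum used for fusion, the hinge ranking loss, the margin value, and the names `fuse_views` and `mil_ranking_loss` are all assumptions made for illustration, and the scores are toy numbers.

```python
import numpy as np

def fuse_views(scores_cam1, scores_cam2):
    """Fuse per-segment scores from two synchronized views.

    Element-wise max is an illustrative assumption, not the paper's rule.
    """
    return np.maximum(scores_cam1, scores_cam2)

def mil_ranking_loss(abnormal_bag, normal_bag, margin=1.0):
    """Hinge ranking loss on the top-scoring instance of each bag.

    A bag is one video split into segments; only the video-level label
    (abnormal or normal) is known, which is the MIL assumption.
    """
    return max(0.0, margin - float(np.max(abnormal_bag)) + float(np.max(normal_bag)))

# Toy scores: the anomaly is barely visible from camera 1 but clear from camera 2.
abn_cam1 = np.array([0.1, 0.2, 0.3, 0.1])
abn_cam2 = np.array([0.1, 0.9, 0.2, 0.1])
nor_cam1 = np.array([0.1, 0.2, 0.1, 0.1])
nor_cam2 = np.array([0.2, 0.1, 0.1, 0.1])

single_view_loss = mil_ranking_loss(abn_cam1, nor_cam1)
two_view_loss = mil_ranking_loss(
    fuse_views(abn_cam1, abn_cam2),
    fuse_views(nor_cam1, nor_cam2),
)
print(single_view_loss, two_view_loss)
```

In this toy case the fused bags are easier to separate, so the two-view loss is lower than the single-view loss. That is the intuition behind using a second perspective, not a reproduction of the reported results.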










Data availability
The datasets used during the current study are available at https://github.com/santiagosilas/MC-VAD-Dataset-BasedOn-PETS2009.
Acknowledgements
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Ethics declarations
Conflict of interest
We have no conflicts of interest to disclose.
About this article
Cite this article
Pereira, S.S.L., Maia, J.E.B. MC-MIL: video surveillance anomaly detection with multi-instance learning and multiple overlapped cameras. Neural Comput & Applic 36, 10527–10543 (2024). https://doi.org/10.1007/s00521-024-09611-3
DOI: https://doi.org/10.1007/s00521-024-09611-3