Joint Spatio-temporal representation based efficient video event detection using and BMCIM model

Multimedia Tools and Applications

Abstract

Video analytics has become an essential research area for surveillance, because cameras are now deployed almost everywhere and the volume of recorded video keeps growing. This volume makes it difficult to identify the significant events in a video. Detecting important events requires representing the video content with spatio-temporal features, since such features describe both the appearance and the motion of an event. This paper introduces a novel technique for effective event detection. Because the shape and motion of a foreground object usually complement each other, the paper builds a joint spatio-temporal representation from two novel features, SI-HOG and TI-EOF. The SI-HOG feature captures the shape of the foreground object together with its structural information, and the TI-EOF feature efficiently extracts the foreground object’s motion. To detect video events, a Bayesian-based Markov Chain Inference Model (BMCIM) is introduced as the data description technique. The proposed method is implemented in MATLAB, and its performance is compared against three existing approaches: MAC, OCELM, and DAML. The method is tested on the UCSD ped1, UCSD ped2, and UMN datasets. It achieved higher event detection accuracy than the existing methods.
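To make the idea of a joint spatio-temporal representation concrete, the following minimal Python sketch concatenates a shape histogram and a motion histogram for one pair of frames. The abstract does not define SI-HOG or TI-EOF, so both parts are generic stand-ins and not the authors' features: the shape part is a plain gradient-orientation histogram (HOG-like), and the motion part is a histogram of normal-flow orientations, a simple approximation of optical flow. All function names, the bin count, and the synthetic frames are assumptions made for illustration.

```python
import numpy as np


def orientation_histogram(dx, dy, bins=8):
    """Magnitude-weighted histogram of vector orientations, L1-normalised."""
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2 * np.pi)  # map angles into [0, 2*pi]
    hist, _ = np.histogram(angle, bins=bins, range=(0, 2 * np.pi),
                           weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist


def shape_descriptor(frame, bins=8):
    """HOG-like stand-in: orientations of the spatial intensity gradient."""
    gy, gx = np.gradient(frame.astype(float))  # derivatives along rows, columns
    return orientation_histogram(gx, gy, bins)


def motion_descriptor(prev_frame, frame, bins=8, eps=1e-6):
    """Optical-flow-like stand-in: normal flow between two consecutive frames."""
    prev_frame = prev_frame.astype(float)
    frame = frame.astype(float)
    gy, gx = np.gradient(frame)
    gt = frame - prev_frame                    # temporal derivative
    denom = gx ** 2 + gy ** 2 + eps            # eps avoids division by zero
    u = -gt * gx / denom                       # flow component along x
    v = -gt * gy / denom                       # flow component along y
    return orientation_histogram(u, v, bins)


def joint_descriptor(prev_frame, frame, bins=8):
    """Concatenate the shape and motion histograms into one feature vector."""
    return np.concatenate([shape_descriptor(frame, bins),
                           motion_descriptor(prev_frame, frame, bins)])


# Synthetic example (assumed data): a bright square moving one pixel right.
prev = np.zeros((32, 32))
prev[12:20, 10:18] = 1.0
curr = np.zeros((32, 32))
curr[12:20, 11:19] = 1.0

d = joint_descriptor(prev, curr)
print(d.shape)  # (16,)
```

The first eight entries describe shape and the last eight describe motion, so a downstream data description model sees both cues in a single vector. In a real pipeline the histograms would be computed per foreground region rather than per frame, and a dense optical flow method would replace the normal-flow approximation.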

Dataset availability

1. http://www.svcl.ucsd.edu/projects/anomaly/dataset.html

2. https://www.crcv.ucf.edu/projects/Abnormal_Crowd/

Author information

Corresponding author

Correspondence to A. Anbarasa Pandian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pandian, A.A., Maheswari, S. Joint Spatio-temporal representation based efficient video event detection using and BMCIM model. Multimed Tools Appl 82, 44577–44589 (2023). https://doi.org/10.1007/s11042-023-15055-z

  • DOI: https://doi.org/10.1007/s11042-023-15055-z
